Facebook's Large Concept Models: Sentence-Level Language Modeling
2025-01-01
Facebook Research has released Large Concept Models (LCMs), an approach to language modeling that operates in a sentence representation space. By building on the SONAR embedding space, LCMs support up to 200 text languages and 57 speech languages. Each sentence is treated as a 'concept', and a sequence-to-sequence model predicts the next sentence autoregressively in that embedding space. The project provides recipes for training and fine-tuning 1.6B-parameter models and explores two generation approaches: MSE regression and diffusion-based generation.
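
To make the idea of autoregressive sentence prediction with an MSE objective concrete, here is a minimal toy sketch in plain Python. It is not the LCM codebase or the SONAR API: the 4-dimensional vectors stand in for sentence embeddings, the linear map stands in for the sequence-to-sequence model, and every name (`predict_next`, `train_step`, `generate`, `doc`) is hypothetical and chosen only for illustration.

```python
def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def predict_next(weights, context):
    """Toy predictor (assumption): a linear map applied to the latest embedding.
    A real model would attend over the whole context."""
    last = context[-1]
    return [sum(w * x for w, x in zip(row, last)) for row in weights]

def train_step(weights, context, target, lr=0.5):
    """One gradient-descent step on the MSE loss; returns the pre-update loss."""
    last = context[-1]
    pred = predict_next(weights, context)
    dim = len(pred)
    for i in range(dim):
        grad_out = 2.0 * (pred[i] - target[i]) / dim  # d(MSE)/d(pred[i])
        for j in range(dim):
            weights[i][j] -= lr * grad_out * last[j]
    return mse(pred, target)

def generate(weights, prefix, steps):
    """Autoregressive rollout: each predicted embedding joins the context."""
    seq = [list(v) for v in prefix]
    for _ in range(steps):
        seq.append(predict_next(weights, seq))
    return seq

# Stand-in "concepts": one made-up vector per sentence of a 3-sentence document.
doc = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

dim = len(doc[0])
weights = [[0.0] * dim for _ in range(dim)]

# Train on (sentences so far -> next sentence) pairs.
losses = []
for epoch in range(200):
    epoch_loss = 0.0
    for t in range(1, len(doc)):
        epoch_loss += train_step(weights, doc[:t], doc[t])
    losses.append(epoch_loss / (len(doc) - 1))

# Start from the first sentence embedding and predict the next two.
generated = generate(weights, [doc[0]], steps=2)
print(losses[0], losses[-1])
```

In this sketch the model never sees tokens: it is trained to regress the next vector from earlier ones, then feeds its own predictions back in at generation time. Turning the predicted vectors back into text would need a decoder for the embedding space, which is omitted here.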