TinyStories: Can Small Language Models Still Tell Coherent English Stories?

2025-01-02

Researchers introduce TinyStories, a synthetic dataset of short stories generated by GPT-3.5 and GPT-4 that uses only words a typical 3 to 4-year-old would understand. They show that language models trained on TinyStories, even ones with fewer than 10 million parameters or a very simple architecture (a single transformer block), can generate fluent, coherent multi-paragraph stories with surprisingly good grammar and reasoning. This challenges the notion that coherent text generation requires massive models and complex architectures. The work also introduces an evaluation paradigm in which GPT-4 grades generated stories much as a human teacher would, which is intended to overcome limitations of standard benchmarks.
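
To give a feel for what "fewer than 10 million parameters" with "a single transformer block" can mean, here is a rough parameter count for a GPT-style decoder. This is only a sketch: the vocabulary size, hidden width, and context length below are illustrative assumptions, not the configurations used by the researchers, and the helper name `transformer_param_count` is hypothetical.

```python
def transformer_param_count(vocab_size, d_model, n_layers, context_len,
                            d_ff=None, tie_embeddings=True):
    """Approximate parameter count for a GPT-style decoder-only transformer.

    Assumes learned positional embeddings, biases on every linear layer,
    two LayerNorms per block, and one final LayerNorm.
    """
    if d_ff is None:
        d_ff = 4 * d_model  # common default for the feed-forward width

    # Token embedding table plus learned positional embeddings.
    embeddings = vocab_size * d_model + context_len * d_model

    # Attention: Q, K, V and output projections, each d_model x d_model + bias.
    attention = 4 * (d_model * d_model + d_model)

    # Feed-forward: up-projection and down-projection, each with a bias.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

    # Two LayerNorms per block, each with a scale and a shift vector.
    block_norms = 2 * 2 * d_model

    per_block = attention + feed_forward + block_norms
    final_norm = 2 * d_model

    # With tied embeddings, the output head reuses the token embedding table.
    output_head = 0 if tie_embeddings else vocab_size * d_model

    return embeddings + n_layers * per_block + final_norm + output_head


# Illustrative (assumed) configuration: one block, small vocabulary.
total = transformer_param_count(
    vocab_size=10_000, d_model=256, n_layers=1, context_len=512
)
print(f"{total:,} parameters")
```

Under these assumed settings the model comes to roughly 3.5 million parameters, most of them in the embedding table, which is comfortably inside the sub-10-million range the summary describes.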