TinyStories: Can Small Language Models Still Tell Coherent English Stories?

2025-01-02

Researchers introduce TinyStories, a synthetic dataset of short stories, generated by GPT-3.5 and GPT-4, that uses only words a typical 3- to 4-year-old understands. They show that language models trained on TinyStories, even ones with fewer than 10 million parameters or with very simple architectures (a single transformer block), can generate fluent, coherent multi-paragraph stories with surprisingly good grammar and some reasoning ability. This challenges the notion that coherent text generation requires massive models and complex architectures. The work also introduces a new evaluation paradigm in which GPT-4 grades generated stories the way a human teacher would, which is intended to overcome limitations of standard benchmarks.

(arxiv.org)

AI language models few-shot learning
