SmolGPT: A Minimal PyTorch Implementation for Training Small LLMs

2025-01-29

SmolGPT is a minimal PyTorch project built for educational use: it lets users train their own small LLMs (large language models) from scratch. Its architecture incorporates Flash Attention, RMSNorm, and SwiGLU, along with efficient sampling techniques. The project provides a complete training pipeline, pre-trained model weights, and text generation examples, making it easy to learn about and experiment with LLM training.
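
For readers unfamiliar with the three architectural components named above, the following is a minimal sketch of what each one typically looks like in PyTorch. It is not taken from SmolGPT's source: the class names, parameter names, and default values are assumptions chosen for illustration, and the attention helper relies on `torch.nn.functional.scaled_dot_product_attention` (PyTorch 2.0 or later), which may dispatch to a fused Flash Attention kernel when the hardware and dtypes allow it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """Illustrative RMSNorm: rescale by the root mean square of the features.

    Unlike LayerNorm, no mean is subtracted and there is no bias term.
    """

    def __init__(self, dim: int, eps: float = 1e-6):  # eps default is an assumption
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned per-feature gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * inv_rms * self.weight


class SwiGLU(nn.Module):
    """Illustrative SwiGLU feed-forward block: down(silu(gate(x)) * up(x))."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


def causal_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Causal self-attention over tensors shaped (batch, heads, seq_len, head_dim).

    PyTorch selects the backend; a fused Flash-style kernel is used only
    when one is available for the given device and dtype.
    """
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)
```

A transformer block would usually apply a norm like this before the attention and feed-forward sublayers and add each result back to the residual stream, though the exact wiring in SmolGPT may differ.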

Development LLM training