MiMo-7B: 7B Parameter Reasoning LLM Outperforms 32B Models

2025-04-30

Xiaomi introduces MiMo-7B, a 7-billion-parameter language model designed for reasoning. By optimizing both the pre-training data and the pre-training strategy, and by applying reinforcement learning during post-training, MiMo-7B performs strongly on math and code reasoning tasks, surpassing even larger 32B-parameter models. The open-source release includes checkpoints for the base model, the SFT model, and the RL-trained models, giving others a starting point for developing reasoning LLMs.