Devstral: Open-Source LLM Outperforms GPT-4.1-mini on Software Engineering Benchmark
2025-05-21

Mistral AI and All Hands AI have released Devstral, an agentic large language model (LLM) built for software engineering tasks. Devstral scores 46.8% on the SWE-Bench Verified benchmark, more than 6 percentage points above previous open-source models, and also surpasses GPT-4.1-mini. It targets complex software engineering problems, such as understanding how components relate across a large codebase and identifying subtle bugs. Devstral is light enough to run on a single RTX 4090 or a Mac with 32GB of RAM, and it supports local deployment, enterprise use, and Copilot integration. The model is open-source and available both via API and through various download options.