MiniMax-M1: A 456B Parameter Hybrid-Attention Reasoning Model

2025-06-18

MiniMax-M1 is an open-weight, large-scale hybrid-attention reasoning model with 456 billion total parameters, of which roughly 45.9 billion are activated per token. Built on a hybrid Mixture-of-Experts (MoE) architecture combined with a lightning attention mechanism, it natively supports a context length of 1 million tokens. Trained with large-scale reinforcement learning, MiniMax-M1 outperforms leading open-weight models such as DeepSeek-R1 and Qwen3-235B on complex tasks, particularly software engineering and long-context understanding. Its efficient test-time compute makes it a strong foundation for next-generation language model agents.
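
For orientation, here is a minimal sketch of how the released weights might be loaded with Hugging Face transformers. The repository ID, dtype handling, and generation settings below are assumptions for illustration only; a 456B-parameter model of this size needs a multi-GPU deployment, and the official release points to a dedicated serving stack (e.g. vLLM) for production use.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository ID; check the official MiniMax release for the exact checkpoint name.
MODEL_ID = "MiniMaxAI/MiniMax-M1-80k"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",      # keep the checkpoint's native precision
    device_map="auto",       # shard the MoE weights across available devices
    trust_remote_code=True,  # the hybrid-attention architecture ships custom modeling code
)

messages = [{"role": "user", "content": "Explain lightning attention in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```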