Qwen2.5-1M: Open-Source LLMs with 1 Million Token Context Length

2025-01-26

The Qwen team released Qwen2.5-1M, open-source large language models that support context lengths of up to one million tokens, in 7B and 14B parameter versions. These models significantly outperform their 128K-context counterparts on long-context tasks and even surpass GPT-4o-mini in some cases. For efficient deployment, the team also open-sourced an inference framework based on vLLM that uses sparse attention to deliver a 3x to 7x speedup. Training relied on a progressive context-length expansion strategy, while inference combines Dual Chunk Attention (DCA) for length extrapolation with sparse attention for efficient long-context handling.
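
For readers who want to try the released checkpoints, below is a minimal sketch of loading one with vLLM's standard offline Python API. The model identifier follows the public Hugging Face naming, but the context-length, GPU-parallelism, and sampling settings are illustrative assumptions, and the sparse-attention optimizations from Qwen's customized vLLM fork are not reflected here.

```python
# Minimal sketch: running Qwen2.5-1M with vLLM's offline Python API.
# Assumptions: vLLM is installed, the checkpoint name is
# "Qwen/Qwen2.5-7B-Instruct-1M", and four GPUs with enough combined
# memory for a very long KV cache are available. The 3x-7x sparse-attention
# speedups described above require Qwen's customized vLLM fork (not shown).
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct-1M",
    max_model_len=1_048_576,      # 1M-token context window (assumed setting)
    tensor_parallel_size=4,       # shard across 4 GPUs (illustrative)
    enable_chunked_prefill=True,  # process long prompts in chunks
)

sampling = SamplingParams(temperature=0.7, max_tokens=512)

# A long document followed by a question; in practice the prompt
# could run to hundreds of thousands of tokens.
prompt = "<document text here>\n\nQuestion: Summarize the key findings."
outputs = llm.generate([prompt], sampling)
print(outputs[0].outputs[0].text)
```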

AI