QVQ-Max: An AI Model with Both Vision and Intellect

2025-04-06

QVQ-Max is a new visual reasoning model that not only 'understands' the content of images and videos but also analyzes and reasons over that information to solve problems. From math questions to everyday puzzles, from programming code to artistic creation, it demonstrates impressive capabilities, excelling at detailed observation, deep reasoning, and flexible application in scenarios such as work, study, and daily life. Future development will focus on improving recognition accuracy, handling multi-step tasks, and expanding interaction methods, with the goal of making it a truly practical visual agent.

Read more

Qwen2.5-VL-32B: A 32B Parameter Visual-Language Model That's More Human-Friendly

2025-03-24

Following the widespread acclaim of the Qwen2.5-VL series, we've open-sourced a new 32-billion-parameter vision-language model, Qwen2.5-VL-32B-Instruct. It brings significant improvements in mathematical reasoning, fine-grained image understanding, and alignment with human preferences. On multimodal benchmarks such as MMMU, MMMU-Pro, and MathVista it surpasses comparably sized models and even outperforms the larger 72-billion-parameter Qwen2-VL-72B-Instruct, while also achieving top-tier pure-text performance at its scale.
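
To illustrate how the open checkpoint can be queried, here is a minimal sketch using Hugging Face transformers. It assumes the repository ID Qwen/Qwen2.5-VL-32B-Instruct and a transformers release that ships the Qwen2.5-VL classes; the image file and prompt are placeholders.

```python
# Minimal sketch, assuming the "Qwen/Qwen2.5-VL-32B-Instruct" repo on
# Hugging Face and a transformers version with Qwen2.5-VL support.
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "Qwen/Qwen2.5-VL-32B-Instruct"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("chart.png")  # any local image
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe the trend shown in this chart."},
    ],
}]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
answer = processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```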

Read more

QwQ-32B: Scaling RL for Enhanced Reasoning in LLMs

2025-03-05

The Qwen team reports a breakthrough in scaling reinforcement learning (RL) for large language models (LLMs): the 32-billion-parameter QwQ-32B achieves performance comparable to the 671-billion-parameter DeepSeek-R1 (which activates 37 billion parameters per token), highlighting how effective RL is when applied to a robust foundation model. QwQ-32B is open-sourced on Hugging Face and ModelScope under the Apache 2.0 license and excels at mathematical reasoning, coding, and general problem-solving. Future work focuses on integrating agents with RL for long-horizon reasoning, pushing the boundaries toward artificial general intelligence (AGI).
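
Since the weights are public, trying the model locally takes only a few lines. The sketch below assumes the Hugging Face repository ID Qwen/QwQ-32B and standard transformers chat-template usage; check the model card for recommended sampling settings.

```python
# Minimal sketch, assuming the open checkpoint "Qwen/QwQ-32B" and a
# recent transformers release; generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models emit a long chain of thought before the final answer,
# so leave generous headroom in max_new_tokens.
output_ids = model.generate(input_ids, max_new_tokens=4096)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```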

Read more

Alibaba Unveils Qwen2.5-Max: A Massive MoE Language Model

2025-01-28

Alibaba has released Qwen2.5-Max, a large-scale Mixture-of-Experts (MoE) model pre-trained on over 20 trillion tokens and further refined with supervised fine-tuning and reinforcement learning from human feedback (RLHF). On benchmarks such as MMLU-Pro, LiveCodeBench, LiveBench, and Arena-Hard, Qwen2.5-Max outperforms models such as DeepSeek V3. The model is accessible via Qwen Chat and an Alibaba Cloud API. This release marks a significant step in scaling large language models and paves the way for future gains in model intelligence.
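
Since Qwen2.5-Max is served rather than open-weight, access goes through the API. A minimal sketch against the OpenAI-compatible endpoint of Alibaba Cloud Model Studio follows; the base URL and the snapshot name qwen-max-2025-01-25 reflect the release announcement but should be verified against current documentation.

```python
# Minimal sketch of calling Qwen2.5-Max through the OpenAI-compatible
# endpoint; base URL and model name are assumptions to verify.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DASHSCOPE_API_KEY",  # placeholder credential
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-max-2025-01-25",  # assumed Qwen2.5-Max snapshot name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Which number is larger, 9.11 or 9.8?"},
    ],
)
print(response.choices[0].message.content)
```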

Read more

Qwen2.5-1M: Open-Source LLMs with 1 Million Token Context Length

2025-01-26

The Qwen team has released Qwen2.5-1M, open-source large language models in 7B and 14B parameter versions that support context lengths of up to one million tokens. They significantly outperform their 128K-context counterparts on long-context tasks, even surpassing GPT-4o-mini in some cases. For efficient deployment, the team also open-sourced an inference framework based on vLLM that uses sparse attention for a 3x to 7x speedup. Training followed a progressive length-extension approach, and the models incorporate Dual Chunk Attention (DCA) together with sparse attention for effective long-context handling.
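
As a deployment sketch, the snippet below loads the 7B variant with vLLM's offline API at an extended context window. The full 3x to 7x sparse-attention speedup relies on the customized vLLM branch the team provides; the model ID, resource flags, and input file here are illustrative and sized for a multi-GPU machine.

```python
# Minimal sketch, assuming "Qwen/Qwen2.5-7B-Instruct-1M" and a vLLM build
# with long-context support (the sparse-attention speedup needs Qwen's branch).
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct-1M",
    max_model_len=1_010_000,        # slight headroom over 1M input tokens
    tensor_parallel_size=4,         # million-token KV caches need several GPUs
    enable_chunked_prefill=True,
    max_num_batched_tokens=131072,  # prefill chunk size
)

with open("long_report.txt") as f:  # e.g. a document hundreds of K tokens long
    document = f.read()

messages = [{
    "role": "user",
    "content": f"{document}\n\nSummarize the key findings of the report above.",
}]
outputs = llm.chat(messages, SamplingParams(temperature=0.7, max_tokens=512))
print(outputs[0].outputs[0].text)
```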

Read more