Alibaba Unveils QvQ: A New Visual Reasoning Model

2024-12-25

Alibaba recently released QvQ-72B-Preview, a new visual reasoning model, under the Apache 2.0 license. QvQ builds on the inference-scaling model QwQ by adding vision processing: it accepts an image together with a text prompt and generates a detailed, step-by-step reasoning process before giving its answer. Blogger Simon Willison tested QvQ and found that it succeeded at tasks such as counting pelicans but was less accurate on more complex reasoning problems. The model is currently available on Hugging Face Spaces, and future plans include local deployment and broader platform support.