Qwen VLo: A Unified Multimodal Model That Understands and Creates Images

2025-06-28

Alibaba's Qwen team has introduced Qwen VLo, a unified multimodal model that both understands image content and generates high-quality images based on that understanding. It uses a progressive generation method, building each image gradually from left to right and top to bottom, which is intended to keep the final result coherent. Qwen VLo accepts instructions in multiple languages, handles complex tasks such as image editing and style transfer, and can interpret the content of images it has generated itself. The model is currently available only as a preview.
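The left-to-right, top-to-bottom ordering can be illustrated with a toy sketch. This is not Qwen VLo's implementation: the summary states only the generation order, so the patch grid, the `generate_progressively` function, and the `toy_predict` stand-in are all hypothetical names introduced here for illustration.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Canvas = List[List[Optional[int]]]


def raster_order(rows: int, cols: int) -> Iterator[Tuple[int, int]]:
    """Yield (row, col) positions top to bottom, left to right within a row."""
    for r in range(rows):
        for c in range(cols):
            yield r, c


def generate_progressively(
    rows: int,
    cols: int,
    predict_patch: Callable[[Canvas, int, int], int],
) -> Canvas:
    """Fill a grid of patches one at a time in raster order.

    Each call to predict_patch sees the canvas as filled so far, so a later
    patch can depend on everything generated above and to the left of it.
    """
    canvas: Canvas = [[None] * cols for _ in range(rows)]
    for r, c in raster_order(rows, cols):
        canvas[r][c] = predict_patch(canvas, r, c)
    return canvas


def toy_predict(canvas: Canvas, r: int, c: int) -> int:
    # Hypothetical stand-in for a model: returns how many patches are
    # already filled, which makes the generation order visible.
    return sum(v is not None for row in canvas for v in row)


canvas = generate_progressively(2, 3, toy_predict)
print(canvas)  # [[0, 1, 2], [3, 4, 5]]
```

The printed values show the order in which each patch was produced: the first row is completed before the second begins.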

(qwenlm.github.io)

AI multimodal model
