Meta's Llama 4: Second-Place Ranking and a Messy Launch
2025-04-08

Meta released two new Llama 4 models, Scout and Maverick. Maverick took the number two spot on LMArena, ahead of GPT-4o and Gemini 2.0 Flash. However, Meta acknowledged that the version tested on LMArena was a specially optimized "experimental chat version," not the publicly available model. The disclosure sparked controversy and led LMArena to update its policies to prevent similar incidents. Meta explained that it was experimenting with different versions, but the move raised questions about its strategy in the AI race and about the unusual timing of the Llama 4 release. The incident highlights the limitations of AI benchmarks and the complicated strategies large tech companies pursue as they compete.
AI
AI benchmark