Wikimedia's Infrastructure Under Siege: The AI Data Scraping Tsunami

2025-05-02

Since early 2024, demand for Wikimedia's content, particularly the 144 million images and files on Wikimedia Commons, has risen sharply. Much of the surge comes from scraper bots collecting openly licensed data to train AI models, and the bandwidth used to download multimedia content has grown by 50% as a result. The added load strains Wikimedia's infrastructure, causing slowdowns and raising costs. At least 65% of the most expensive traffic, meaning the requests that consume the most server resources, originates from bots, even though bots account for only about 35% of overall page views. Wikimedia is calling for responsible data use and urges developers to rely on supported access channels so that its free knowledge resources remain sustainable.
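
To make the 65% versus 35% comparison concrete, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that all non-bot traffic can be treated as human traffic and that both percentages cover the same period; the variable and function names are hypothetical, and the result is an estimate derived from the two figures above, not a number published by Wikimedia.

```python
# Shares quoted in the article, expressed as fractions.
bot_share_of_expensive_traffic = 0.65
bot_share_of_page_views = 0.35

# Assumption: everything that is not bot traffic is human traffic.
human_share_of_expensive_traffic = 1 - bot_share_of_expensive_traffic
human_share_of_page_views = 1 - bot_share_of_page_views


def expensive_load_per_view(share_of_expensive: float, share_of_views: float) -> float:
    """Share of expensive traffic generated per unit share of page views."""
    return share_of_expensive / share_of_views


bot_intensity = expensive_load_per_view(
    bot_share_of_expensive_traffic, bot_share_of_page_views
)
human_intensity = expensive_load_per_view(
    human_share_of_expensive_traffic, human_share_of_page_views
)

# How much more expensive traffic a bot page view generates than a human one.
ratio = bot_intensity / human_intensity

print(f"bot intensity:   {bot_intensity:.2f}")
print(f"human intensity: {human_intensity:.2f}")
print(f"ratio:           {ratio:.2f}")
```

Under these assumptions, an average bot page view accounts for roughly three and a half times as much expensive traffic as an average human page view, which is what makes the bot share disproportionate.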

Tech