SourceHut Fights Back Against Aggressive LLM Scraping

2025-04-15

SourceHut, a platform dedicated to open-source software, is pushing back against aggressive data scraping by large language model (LLM) companies. It argues that these companies are not entitled to its users' data and has stated explicitly that it will not enter into data-sharing arrangements with any company, even for payment. SourceHut has deployed Anubis to protect its services and updated its terms of service to strictly limit scraping, permitting only uses such as search engine indexing, open-access research, and archiving. It emphasizes that the data belongs to its users and that its responsibility is to ensure the data is used in their best interests, not for commercial profit or for training LLMs.
