Anthropic's Claude Browser Extension: A Controlled Test for AI Safety
2025-08-27
Anthropic is testing a Chrome extension that lets its AI assistant, Claude, work directly within the browser. This makes Claude considerably more useful, but it also introduces significant safety risks, chiefly prompt injection attacks. In red-teaming experiments, attacks succeeded 23.6% of the time when no mitigations were in place. Anthropic then added several safeguards, including permission controls, action confirmations, and advanced classifiers, which reduced the attack success rate to 11.2%. The extension is currently in a limited pilot with 1,000 Max plan users, intended to gather real-world feedback and improve safety before a wider release.
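To put the two reported figures side by side, the short Python sketch below works out the absolute and relative drop in attack success rate. It uses only the 23.6% and 11.2% numbers quoted above. The variable and function names are illustrative and do not come from Anthropic.

```python
# Attack success rates quoted in the summary, in percent.
baseline_rate = 23.6   # red-teaming result without mitigations
mitigated_rate = 11.2  # red-teaming result with safeguards in place


def absolute_reduction(before: float, after: float) -> float:
    """Drop in success rate, in percentage points."""
    return before - after


def relative_reduction(before: float, after: float) -> float:
    """Drop in success rate as a percentage of the baseline rate."""
    return (before - after) / before * 100


abs_drop = absolute_reduction(baseline_rate, mitigated_rate)
rel_drop = relative_reduction(baseline_rate, mitigated_rate)

print(f"Absolute reduction: {abs_drop:.1f} percentage points")
print(f"Relative reduction: {rel_drop:.1f}%")
```

By this arithmetic the safeguards cut the success rate by 12.4 percentage points, or a little over half of the baseline, which still leaves roughly one in nine attacks in that test succeeding.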
AI