Anthropic Gives Claude the Power to End Conversations
2025-08-16
Anthropic has given its large language model, Claude, the ability to terminate conversations in rare cases of persistently harmful or abusive user interactions. The feature, which grew out of exploratory research into AI welfare, is intended to mitigate potential risks to the model's welfare. In testing, Claude showed a strong aversion to harmful tasks, apparent distress when confronted with harmful requests, and a tendency to end such conversations only after multiple redirection attempts failed. The functionality is reserved for extreme edge cases; the vast majority of users will not be affected.
AI
AI welfare