Anthropic, the company behind the popular AI assistant Claude, has added a new safety feature.
The latest Claude models can now end a conversation when a user is persistently harmful or abusive, a notable step toward keeping AI conversations safe.
The feature is reserved for rare, extreme cases where a user repeatedly pushes the AI to say or do something dangerous. Instead of only refusing the request, Claude can now end the chat entirely, part of Anthropic’s ongoing work to build safer, more responsible AI.
This new ability grew out of a broader research effort Anthropic calls “model welfare,” in which researchers consider whether the AI system itself should be shielded from prolonged exposure to abusive or toxic requests.
When Claude ends a conversation, the user can no longer send messages in that chat but is free to start a new one. This lets Anthropic shut down abusive interactions without banning the user outright.