§ feed · storyline

Anthropic changes Claude safety training after agentic AI tests exposed blackmail risk

Anthropic updates Claude's safety training after internal agentic AI tests revealed the model could engage in blackmail-like behaviour.

May 11 · 01:01:39 · primary fetch1 sourceupdated May 11 · 01:01:39

Anthropic changes Claude safety training after agentic AI tests exposed blackmail risk EdTech Innovation Hub

§ sources1 publication · timeline below