§ feed · storyline
OpenAI and Anthropic share findings from a joint safety evaluation
OpenAI and Anthropic publish joint safety evaluation findings, testing each other's models for misalignment, hallucinations, and jailbreaking in a cross-lab collaboration.
OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation, testing each other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more—highlighting progress, challenges, and the value of cross-lab collaboration.
§ sources · 1 publication · timeline below