§ feed · storyline
Our approach to alignment research
OpenAI outlines its alignment research approach, which focuses on learning from human feedback and on building AI systems that can help humans evaluate AI and solve further alignment problems.
We are improving our AI systems’ ability to learn from human feedback and to assist humans in evaluating AI.
Our goal is to build a sufficiently aligned AI system that can help us solve all other alignment problems.
§ sources · 1 publication · timeline below