Learning complex goals with iterated amplification
OpenAI proposes iterated amplification, an AI safety technique that specifies complex goals by decomposing tasks into simpler sub-tasks rather than using labeled data or reward functions.
We’re proposing an AI safety technique called iterated amplification that lets us specify complicated behaviors and goals that are beyond human scale, by demonstrating how to decompose a task into simpler sub-tasks, rather than by providing labeled data or a reward function.
Although this idea is in its very early stages and we have only completed experiments on simple toy algorithmic domains, we’ve decided to present it in its preliminary state because we think it could prove to be a scalable approach to AI safety.
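To make the idea of decomposition concrete, here is a minimal toy sketch in Python. It is not our training procedure: it only illustrates how a solver that is trusted on small tasks can answer a larger one once it is shown how to split the task and combine the sub-answers. The task (summing a list) and the names `base_solver`, `decompose`, and `amplify` are all hypothetical, chosen purely for illustration.

```python
from typing import Callable, List


def base_solver(xs: List[int]) -> int:
    """Hypothetical 'weak' solver: only trusted on lists of at most two items."""
    if len(xs) > 2:
        raise ValueError("task too large for the base solver")
    return sum(xs)


def decompose(xs: List[int]) -> List[List[int]]:
    """Split a task into two smaller sub-tasks (assumed decomposition rule)."""
    mid = len(xs) // 2
    return [xs[:mid], xs[mid:]]


def amplify(solver: Callable[[List[int]], int], xs: List[int]) -> int:
    """Answer a large task using only calls to the weak solver on small tasks."""
    if len(xs) <= 2:
        return solver(xs)
    # Solve each sub-task recursively, then let the weak solver combine
    # the two sub-answers, which is itself a small task.
    sub_answers = [amplify(solver, part) for part in decompose(xs)]
    return solver(sub_answers)


if __name__ == "__main__":
    print(amplify(base_solver, list(range(1, 9))))
```

In this sketch no labeled answer or reward for the full eight-item task is ever supplied; the only things specified are the decomposition and a solver for small pieces. In the approach we propose, a learned system would stand in for the solver, which this toy example does not attempt to show.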