§ feed · storyline
Learning to summarize with human feedback
OpenAI applies reinforcement learning from human feedback to train language models that produce more accurate text summaries.
We’ve applied reinforcement learning from human feedback to train language models that are better at summarization.
§ sources · 1 publication · timeline below