
Evolved Policy Gradients

OpenAI releases Evolved Policy Gradients, a metalearning method that evolves loss functions for learning agents, enabling faster training and generalisation to novel tasks outside their training regime.

Apr 18 · 09:00:00 · primary fetch · 1 source · updated Apr 18 · 09:00:00

We’re releasing an experimental metalearning approach called Evolved Policy Gradients, a method that evolves the loss function of learning agents, which can enable fast training on novel tasks. Agents trained with EPG can succeed at basic tasks at test time that were outside their training regime, like learning to navigate to an object on a different side of the room from where it was placed during training.
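To make "evolving the loss function" concrete, here is a minimal toy sketch, not OpenAI's implementation. An inner loop trains a one-parameter "agent" by gradient descent on a learned loss, and an outer loop evolves that loss's parameters so the trained agent scores well on the true task objective. Everything in it is an assumption made for illustration: the quadratic loss form, the names `inner_train`, `fitness`, and `evolve_loss`, the fact that the loss is handed the goal directly, and the use of a simple elitist (1+λ) evolution strategy in place of whatever optimizer the release uses.

```python
import numpy as np

def inner_train(phi, goal, steps=5, lr=0.5):
    """Inner loop: train a scalar parameter theta by gradient descent on the
    learned loss L_phi(theta) = phi[0] * (theta - phi[1] * goal)**2.
    This quadratic form is a hypothetical toy, chosen so the gradient is analytic."""
    theta = 0.0
    for _ in range(steps):
        grad = 2.0 * phi[0] * (theta - phi[1] * goal)  # dL_phi / dtheta
        theta -= lr * grad
    return theta

def fitness(phi, goals):
    """True objective used only by the outer loop: mean negative squared
    distance between the trained theta and each goal."""
    return float(np.mean([-(inner_train(phi, g) - g) ** 2 for g in goals]))

def evolve_loss(goals, generations=50, population=16, sigma=0.1, seed=0):
    """Outer loop: elitist (1+lambda) evolution over the loss parameters phi.
    A candidate replaces the incumbent only if its fitness is strictly higher."""
    rng = np.random.default_rng(seed)
    phi = np.array([0.1, 0.0])  # initial loss: pulls theta toward 0, ignoring the goal
    best = fitness(phi, goals)
    for _ in range(generations):
        candidates = phi + sigma * rng.standard_normal((population, 2))
        scores = [fitness(c, goals) for c in candidates]
        i = int(np.argmax(scores))
        if scores[i] > best:
            phi, best = candidates[i], scores[i]
    return phi, best

if __name__ == "__main__":
    goals = [-2.0, -1.0, 1.0, 2.0]
    phi, best = evolve_loss(goals)
    print("evolved loss parameters:", phi, "fitness:", best)
```

Because the outer loop only ever keeps a candidate that improves fitness on a fixed set of goals, the evolved loss is never worse than the initial one on those goals. In this toy the agent updates only through the gradient of the learned loss, while the true objective is used solely to score candidate losses, which is the division of labour the excerpt describes.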


§ sources · 1 publication · timeline below

openai.com · Evolved Policy Gradients · primary · 09:00:00