§ feed · storyline
OpenAI Baselines: ACKTR & A2C
OpenAI releases two reinforcement learning baseline implementations, A2C and ACKTR, with ACKTR offering improved sample efficiency over TRPO and A2C at modest additional compute cost.
We’re releasing two new OpenAI Baselines implementations: ACKTR and A2C. A2C is a synchronous, deterministic variant of Asynchronous Advantage Actor Critic (A3C), which we’ve found performs as well as A3C.
ACKTR is a more sample-efficient reinforcement learning algorithm than TRPO and A2C, and requires only slightly more computation than A2C per update.
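As a rough illustration of the "synchronous" part of A2C, the sketch below computes advantages for a batch of rollouts gathered in lockstep from several environments, so that a single update can use all of them at once. It is a minimal sketch of one common formulation (discounted n-step returns minus a value baseline), not code taken from OpenAI Baselines; the function names, argument layout, and the `gamma` default are assumptions made for the example.

```python
from typing import List, Sequence


def n_step_returns(
    rewards: Sequence[float],
    dones: Sequence[bool],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> List[float]:
    """Discounted returns for one environment's rollout (hypothetical helper).

    `bootstrap_value` is the critic's value estimate for the state that
    follows the last step of the rollout.
    """
    returns: List[float] = []
    running = bootstrap_value
    # Walk the rollout backwards so each step reuses the return after it.
    for reward, done in zip(reversed(rewards), reversed(dones)):
        # An episode boundary stops value from flowing back across it.
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def synchronous_advantages(
    batch_rewards: Sequence[Sequence[float]],
    batch_dones: Sequence[Sequence[bool]],
    batch_values: Sequence[Sequence[float]],
    bootstrap_values: Sequence[float],
    gamma: float = 0.99,
) -> List[List[float]]:
    """Advantages (return minus value baseline) for every environment.

    Each outer index is one environment. All environments are assumed to
    have been stepped the same number of times before this is called,
    which is what makes the update synchronous.
    """
    advantages: List[List[float]] = []
    for rewards, dones, values, bootstrap in zip(
        batch_rewards, batch_dones, batch_values, bootstrap_values
    ):
        returns = n_step_returns(rewards, dones, bootstrap, gamma)
        advantages.append([ret - val for ret, val in zip(returns, values)])
    return advantages
```

In an actor-critic update these advantages would weight the policy-gradient term, while the returns serve as regression targets for the critic. Waiting for every environment to finish its steps before updating is the design choice that distinguishes this setup from A3C's asynchronous workers.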
§ sources · 1 publication
- openai.com · OpenAI Baselines: ACKTR & A2C · primary