shipfeed · AI news, curated daily


Learning from human preferences

OpenAI and DeepMind publish a reinforcement learning algorithm that infers human preferences by asking humans to compare pairs of proposed behaviors, removing the need to hand-code goal functions.

Jun 13 · primary fetch · 1 source · updated Jun 13

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior.

In collaboration with DeepMind’s safety team, we’ve developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better.
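To make the idea of learning from pairwise comparisons concrete, here is a minimal sketch, not the published algorithm. It assumes a linear reward over hand-picked behavior features and a logistic (Bradley-Terry style) model of which behavior gets chosen; neither detail comes from the summary above. The names `reward`, `fit_reward`, and `simulate_comparisons` are hypothetical, and the "human" is simulated by a hidden weight vector.

```python
import math
import random


def reward(weights, features):
    """Linear reward estimate: dot product of weights and behavior features."""
    return sum(w * x for w, x in zip(weights, features))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_reward(comparisons, n_features, lr=0.5, epochs=200):
    """Fit reward weights from (preferred, rejected) feature pairs.

    Assumption: P(preferred beats rejected) = sigmoid(r(preferred) - r(rejected)).
    Each step is stochastic gradient ascent on the log-likelihood of one comparison.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            p = sigmoid(reward(weights, preferred) - reward(weights, rejected))
            for i in range(n_features):
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights


def simulate_comparisons(hidden_weights, n_pairs, seed=0):
    """Stand-in for a human rater: picks the behavior a hidden reward scores higher."""
    rng = random.Random(seed)
    n = len(hidden_weights)
    pairs = []
    for _ in range(n_pairs):
        a = [rng.random() for _ in range(n)]
        b = [rng.random() for _ in range(n)]
        if reward(hidden_weights, a) >= reward(hidden_weights, b):
            pairs.append((a, b))
        else:
            pairs.append((b, a))
    return pairs


if __name__ == "__main__":
    # Hypothetical hidden preference: more of feature 0, less of feature 1.
    pairs = simulate_comparisons([2.0, -1.0], n_pairs=200)
    learned = fit_reward(pairs, n_features=2)
    print("learned weights:", learned)
```

The fitted weights are only meaningful up to scale: comparisons reveal which behavior is preferred, not by how much. In a reinforcement learning setting, a reward estimate like this would stand in for the hand-written goal function, with the agent trained against it.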

read full article on openai.com
§ sources · 1 publication · timeline below
  1. openai.com · Learning from human preferences · primary