Perfectly Aligning AI’s Values With Humanity’s Is Impossible
One of the hardest problems in artificial intelligence is “alignment,” or making sure AI goals match our own, a challenge that may prove especially important if superintelligent AIs that outmatch us intellectually are ever developed. But scientists in England and their colleagues now report in the journal PNAS Nexus that perfect alignment between AI systems and human interests is mathematically impossible. All may not be lost, the scientists say. To cope with this impossibility, they suggest a strategy involving pitting AI systems with different modes of reasoning and partially overlapping goals against each other.
As the AI systems attempt to meet their own objectives in this “cognitive ecosystem” instilled with “artificial neurodivergence,” they will dynamically help or hinder each other, preventing dominance by any single AI. We spoke with Hector Zenil, associate professor of healthcare and biomedical engineering at King’s College London, about his and his colleagues’ work on alignment’s limits and its future.
IEEE Spectrum: How did you first become interested in the question of alignment?
Zenil: I became interested because too much of the alignment discussion was framed…