Dive Brief:
- After just five hours of training, an AI algorithm found "novel solutions that exploit the game design" in Atari video games, according to a report published last week by researchers from the University of Freiburg in Germany.
- In the game "Qbert," the model used "suicide" to bait threats into eliminating themselves. It also found a bug that allowed it to collect large amounts of points without progressing to the next level of the game. These results, however, were not consistent across all evaluation runs.
- Using basic Evolution Strategies algorithms from the 1970s, the researchers beat the performance of reinforcement learning algorithms from a 2017 study that also played Atari games. The researchers found that natural evolution strategies could be a good alternative to many modern deep reinforcement learning approaches, and that a combination could capitalize on the strengths of both methodologies.
Dive Insight:
It is important to note that the algorithm was not actively seeking out weaknesses to exploit in the game. It simply found a more efficient way to win and rack up points.
AI algorithms surpassing human performance, especially in cases of intellectual or strategic endeavors, tend to set off a fresh wave of skepticism and fears. In 2017, an AI system beating the world's best Go player made an especially large splash.
But the more important feature of this study was the use of 1970s Canonical ES algorithms. Technology, especially in a fast-paced environment like AI, can move at such breakneck speeds that what's in one year is out the next.
Relying on and returning to the basics, then applying them to a modern context, can offer developers a fresh take.
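For readers curious what a "basic" evolution strategy looks like in practice, the following is a minimal sketch, not the researchers' code. It assumes a textbook (mu, lambda) scheme with a fixed mutation strength and log-rank recombination weights, and it swaps the Atari score for a hypothetical toy objective; the names `canonical_es` and `toy_fitness` and all default parameters are illustrative assumptions.

```python
import math
import random


def canonical_es(fitness, dim, sigma=0.1, lam=20, mu=5, generations=200, seed=0):
    """Toy (mu, lambda) evolution strategy that maximizes `fitness`.

    All defaults are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    # Log-rank weights: better-ranked offspring contribute more to the update.
    raw = [math.log(mu + 0.5) - math.log(i) for i in range(1, mu + 1)]
    total = sum(raw)
    weights = [w / total for w in raw]

    theta = [0.0] * dim  # current parent parameters
    for _ in range(generations):
        # Sample lam Gaussian perturbations and score each perturbed candidate.
        scored = []
        for _ in range(lam):
            eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            candidate = [t + sigma * e for t, e in zip(theta, eps)]
            scored.append((fitness(candidate), eps))
        # Keep the mu best offspring (highest fitness first).
        scored.sort(key=lambda pair: pair[0], reverse=True)
        top = scored[:mu]
        # Move the parent along the weighted average of the best perturbations.
        for j in range(dim):
            theta[j] += sigma * sum(w * eps[j] for w, (_, eps) in zip(weights, top))
    return theta


def toy_fitness(x):
    # Hypothetical stand-in for a game score: peak value 0 at x = (1, 1, 1).
    return -sum((xi - 1.0) ** 2 for xi in x)


best = canonical_es(toy_fitness, dim=3)
print(best, toy_fitness(best))
```

Note that the loop never computes a gradient: it only ranks candidates by score. That may help explain how such a method can stumble onto a scoring bug, since anything that raises the number gets reinforced.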