Dive Brief:
- Abstract reasoning is a critical component of developing artificial intelligence systems with capabilities on par with humans. DeepMind created "visual IQ tests" to assess AI models' ability to solve abstract reasoning tasks when given the proper training data, and to generalize those abilities in situations where training and test data differ, according to research published by the Google subsidiary Wednesday.
- To learn and apply abstract reasoning, the models had to solve puzzles built from new stimuli involving arithmetic progressions and logical operations, analyzing attributes such as color, size and quantity, according to the report.
- With some caveats, neural networks were able to learn and infer, although performance was not consistent across model types. Models performed well when questions were uniform in perception and structure, handling known attribute values and content even in unfamiliar combinations. But performance was poor when extrapolating to inputs outside that knowledge base or working with completely new situations (the sketch after this list illustrates the distinction).
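
To make the interpolation-versus-extrapolation distinction concrete, here is a minimal, hypothetical Python sketch of how such a split could be set up. The attribute names, value ranges and puzzle structure below are illustrative assumptions, not DeepMind's actual dataset code.

```python
import random

# Attribute values a puzzle panel can take. In the interpolation regime the
# test set reuses values seen in training (in new combinations); in the
# extrapolation regime the test set draws values the model never saw.
ALL_SIZES = list(range(1, 11))      # 1 (smallest) .. 10 (largest)
TRAIN_SIZES = ALL_SIZES[:5]         # sizes 1-5 appear during training
HELD_OUT_SIZES = ALL_SIZES[5:]      # sizes 6-10 reserved for extrapolation
COLORS = ["red", "green", "blue", "yellow"]

def make_progression_puzzle(sizes, colors):
    """Build a toy 'arithmetic progression' row: three panels whose shape
    count increases by a fixed step, plus the correct fourth panel."""
    start = random.randint(1, 3)
    step = random.randint(1, 2)
    color = random.choice(colors)
    size = random.choice(sizes)
    row = [{"count": start + i * step, "color": color, "size": size}
           for i in range(3)]
    answer = {"count": start + 3 * step, "color": color, "size": size}
    return row, answer

# Interpolation-style test: same value ranges as training, new combinations.
train_puzzle = make_progression_puzzle(TRAIN_SIZES, COLORS)
interp_test = make_progression_puzzle(TRAIN_SIZES, COLORS)

# Extrapolation-style test: attribute values the model never trained on.
extrap_test = make_progression_puzzle(HELD_OUT_SIZES, COLORS)

print("train:", train_puzzle)
print("interpolation test:", interp_test)
print("extrapolation test:", extrap_test)
```

In this toy setup, a model that only memorizes how sizes 1-5 look would have nothing to fall back on when the held-out sizes appear, which mirrors the extrapolation failures the researchers describe.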
Dive Insight:
While the testing of these models took place in a "highly constrained" world, with solutions that would not "match those applied by successful humans," creating benchmarks for AI is an important part of the development process, helping take stock of current progress and of what can be improved.
DeepMind researchers, in addition to making the abstract reasoning challenge available to the AI community, plan to continue studying solutions generated by some models and improving generalization capabilities.
Abstract reasoning is not the only uphill battle for AI trying to match human intelligence and capabilities. Linguistic ambiguity makes teaching computers to correctly interpret context in communication a difficult feat. For example, a computer must learn to distinguish between the meanings of "eating spaghetti with cheese" and "eating spaghetti with dogs," as the toy sketch below illustrates.
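
As a toy illustration only (the word lists and rule below are invented for this example and bear no relation to how production assistants actually parse language), the same surface pattern can demand two different interpretations:

```python
# Toy illustration of prepositional-phrase ambiguity: the same surface form,
# "eating spaghetti with X", can mean X is a topping or X is a companion.
FOODS = {"cheese", "meatballs", "garlic"}
COMPANIONS = {"dogs", "friends", "family"}

def interpret(phrase: str) -> str:
    noun = phrase.rsplit(" ", 1)[-1]  # the word after "with"
    if noun in FOODS:
        return f"'{noun}' modifies the spaghetti (an ingredient or topping)"
    if noun in COMPANIONS:
        return f"'{noun}' modifies the eating (who you eat with)"
    return "ambiguous without more context"

for phrase in ["eating spaghetti with cheese", "eating spaghetti with dogs"]:
    print(phrase, "->", interpret(phrase))
```

Real systems have to learn these distinctions from context rather than from a hand-written list, which is what makes the problem hard.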
But Siri and Google Assistant can solve any problem, right?
Intelligent assistants, among the most visible and common AI tools consumers interact with, increasingly rival and sometimes outperform human abilities. But an AI tool passing a Turing test (though questions abound over whether that has actually happened yet) doesn't necessarily mean the tool has humanlike analytical reasoning capabilities.
As with a human IQ test, where too much preparation can skew results, the capabilities of neural networks can be tricky to assess "given their striking capacity for memorization and ability to exploit superficial statistical cues," according to DeepMind researchers. Intelligent assistants have been fed mountains of data to help consumers in almost every conceivable area, yet they can still fall short when presented with unfamiliar problems.