Dive Brief:
- An algorithm is only as good as its training, and Google is turning to the public to help train its machine learning systems in image recognition, reports Wired. The technology company's Crowdsource site and application let internet users verify image contents and upload photos of their own.
- Researchers hope that drawing training data from outside the U.S. and Europe will improve the algorithms, which can be limited to images depicting Western culture and affluence, according to Anurag Batra, a Google researcher. A Bangalore-based team encourages use of the app around Asia, and Latin America is likely the next region in line as the company looks to expand the crowdsourcing effort worldwide.
- Improving its algorithms' literacy in regions outside the U.S. is critical for Google as it builds out its global footprint in newer markets. It is not the company's first use of crowdsourced data, however: Google also collects images and data from users through Google Maps and CAPTCHA.
Dive Insight:
Efforts to reduce bias in AI algorithms extend far beyond race and gender to culture, language and geography. An algorithm developed in one setting, even with the best of intentions, cannot always be transposed to a new place or context while maintaining effectiveness.
As companies integrate speech and image recognition into more technologies, from consumer handheld devices and voice assistants in offices to autonomous vehicles, a thorough and accurate data set is critical. Google is not the only technology company working on the issue either.
In December, IBM rolled out a collection of one million video clips for non-commercial and educational use in AI training. The data set uses three-second clips to reduce complexity, each identifying a basic action like a yawn or hug, which algorithms can then build on when identifying more complex actions, such as changing a car tire.
Big companies with advanced AI units have a natural advantage in AI and ML given the troves of information they have on hand for algorithm training plus a global user base to tap into. Google, already a dominant global player in AI, is well positioned for future development with such programs.