Facebook has improved its computers' ability to categorize objects in photos by studying 3.5 billion Instagram photos.
The social networking giant revealed its latest artificial intelligence project on Wednesday in San Jose, Calif., during the company’s annual F8 developer conference.
Facebook (FB, +1.23%) chief technology officer Mike Schroepfer explained to an audience of coders the challenge of making computers more accurate at identifying specific objects in photos. One of the biggest problems is not having enough properly labeled photos with which to train the computers to understand what is in them.
For example, before a computer can understand that an apple in a photo is indeed an apple, it needs to have been “trained” on previous photos of apples that humans accurately labeled with the right fruit.
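To make that dependence on labels concrete, here is a minimal, hypothetical sketch of supervised image classification in PyTorch. It is not Facebook's code, and the folder layout, model choice, and training settings are assumptions; the point is simply that the model can only learn "apple" because a human has already sorted apple photos into a folder labeled "apple."

```python
# Minimal sketch (not Facebook's code): supervised image classification,
# where labels come from human-organized folders such as photos/apple/*.jpg.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# ImageFolder reads human-labeled directories; the path is hypothetical.
train_data = datasets.ImageFolder("photos/", transform=transform)
loader = DataLoader(train_data, batch_size=32, shuffle=True)

model = models.resnet18(num_classes=len(train_data.classes))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for images, labels in loader:  # labels come from the human-made folder names
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```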
Because Facebook owns the popular Instagram photo-sharing service, it makes sense that the company would want to use the data Instagram captures from the photos people upload, and then use those photos to improve its overall image-recognition capabilities.
Schroepfer said that Facebook took 3.5 billion Instagram photos, labeled with the hashtags people attached to describe them, and was able to “produce state of the art results” on the popular ImageNet computer-vision benchmark, which AI researchers use to gauge their projects’ effectiveness against others.
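The general idea behind that result is to treat a photo's hashtags as weak labels for large-scale pretraining. The snippet below is a rough, hypothetical sketch of that approach, not Facebook's published code: a network is pretrained to predict each photo's hashtags as a multi-label target, then its final layer is swapped out for ImageNet's 1,000 classes. The vocabulary size, architecture, and data pipeline are assumptions.

```python
# Hedged sketch of hashtag-based pretraining (not Facebook's actual code).
import torch
import torch.nn as nn
from torchvision import models

NUM_HASHTAGS = 17000  # assumed size of a curated hashtag vocabulary

model = models.resnet50(num_classes=NUM_HASHTAGS)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()  # multi-label: a photo can carry many hashtags

def pretrain_step(images, hashtag_targets):
    """hashtag_targets: float tensor of shape (batch, NUM_HASHTAGS),
    with 1s marking the hashtags attached to each photo."""
    optimizer.zero_grad()
    loss = loss_fn(model(images), hashtag_targets)
    loss.backward()
    optimizer.step()
    return loss.item()

# After pretraining, reuse the learned features for the ImageNet benchmark
# by replacing the hashtag head with a 1,000-class classifier.
model.fc = nn.Linear(model.fc.in_features, 1000)
```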
In an interview with Fortune prior to Facebook’s developer conference, Facebook’s head of applied computer vision Manohar Paluri said that one of the challenges in training the company’s computers was that many of the Instagram photos had “noisy hashtags,” meaning that someone could have described a photo of a dog as a husky when in fact it was a different breed.
“The noise is all over the place,” Paluri said regarding the Instagram photos with inaccurate descriptions.
Once the computers were able to analyze the billions of photos, Facebook then essentially crosschecked the results using a popular linguistic database for the English language called WordNet, Paluri said.
In a research paper on the project, Facebook researchers said that using WordNet let the company group common hashtags together to cut down on the noise. For example, Instagram photos tagged “brown bear” could now be associated with photos tagged with the brown bear’s scientific name, “Ursus arctos arctos.”
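As a rough illustration of that grouping step, the snippet below uses NLTK's WordNet interface to map hashtags onto canonical WordNet concepts so that synonymous tags collapse onto a single label. This is an assumption-laden sketch, not Facebook's implementation, and the example hashtags are illustrative only.

```python
# Illustrative sketch: canonicalize hashtags with WordNet via NLTK.
# Requires one-time setup: import nltk; nltk.download("wordnet")
from collections import defaultdict
from nltk.corpus import wordnet as wn

def canonical_concept(hashtag: str):
    """Map a hashtag to its first WordNet synset name, so synonymous
    tags (e.g. a common name and a scientific name) share one label."""
    synsets = wn.synsets(hashtag.replace(" ", "_"))
    return synsets[0].name() if synsets else None

groups = defaultdict(list)
for tag in ["brown_bear", "ursus_arctos", "dog", "domestic_dog"]:
    groups[canonical_concept(tag)].append(tag)

print(dict(groups))
# e.g. (with standard WordNet data):
# {'brown_bear.n.01': ['brown_bear', 'ursus_arctos'],
#  'dog.n.01': ['dog', 'domestic_dog']}
```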
As a result of this training process, Paluri said that Facebook’s computers could now distinguish between specific species of birds in photographs as well as the different weather conditions in those photos.
The system can now also tell the difference between varieties of dosa, a popular crepe-like food, and recognize that the pancake-like treats are part of Indian cuisine, Paluri said.
The entire AI project took 22 days and required the power of 330 graphics processing units, or GPUs, which are well suited to these kinds of deep-learning tasks.
Paluri said that with the improved ability to understand images, Facebook would be able to create more accurate audio descriptions of photos for its blind users.
Schroepfer said that the new AI image recognition improvements are already baked into various Facebook products.
“To be honest this is just the beginning,” Paluri said. “We are starting to learn from this data.”