Google (NASDAQ: GOOG) specialists have created a machine called PlaNet that can identify the location of almost any photograph using only the pixels it contains, even if the photograph was taken indoors, according to MIT Technology Review.
PlaNet outperforms humans and can even recognize the location of indoor photographs and of pictures of specific things such as pets or food. To create the machine, the team led by Tobias Weyand, a computer vision specialist at Google, gathered data from more than 126 million geolocated images from the Web.
The team then used millions of pictures to teach a neural network to identify the location using just the image itself. They explain in the abstract of their paper that it can be difficult for a system to determine the location of a picture from its pixels alone.
Usually, cues such as landmarks, weather patterns, vegetation, road markings and architectural details can, when combined, provide information about an approximate or exact location. Websites like GeoGuessr and View from your Window currently challenge humans to determine where a picture was taken by relying on those cues.
In order to improve location recognition, the Google team created a machine that is able to use and integrate multiple visible cues. It can outperform previous approaches and even exceed human levels of accuracy, the team wrote.
“In contrast, we pose the problem as one of classification by subdividing the surface of the earth into thousands of multi-scale geographic cells and train a deep network using millions of geotagged images,” added the team.
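The idea in that quote, turning "where was this taken?" into "which cell does this belong to?", can be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's actual partitioning scheme: it uses a plain latitude/longitude quadtree, and the names `build_cells`, `cell_label`, `max_photos` and `max_depth` are hypothetical. Regions with many photos are split into smaller cells, empty regions are dropped, and each remaining cell becomes one class label that a network could be trained to predict.

```python
def build_cells(photos, max_photos=2, max_depth=8):
    """Adaptively split the lat/lon rectangle into multi-scale cells.

    photos: list of (lat, lon) pairs. Assumes lat < 90 and lon < 180,
    because cells are half-open intervals in this sketch.
    Returns a list of (south, west, north, east) tuples.
    """
    cells = []

    def split(south, west, north, east, points, depth):
        if not points:
            return  # empty region (e.g. open ocean): no class is created
        if len(points) <= max_photos or depth == max_depth:
            cells.append((south, west, north, east))
            return
        mid_lat = (south + north) / 2
        mid_lon = (west + east) / 2
        quadrants = [
            (south, west, mid_lat, mid_lon),
            (south, mid_lon, mid_lat, east),
            (mid_lat, west, north, mid_lon),
            (mid_lat, mid_lon, north, east),
        ]
        for s, w, n, e in quadrants:
            inside = [(lat, lon) for lat, lon in points
                      if s <= lat < n and w <= lon < e]
            split(s, w, n, e, inside, depth + 1)

    split(-90.0, -180.0, 90.0, 180.0, list(photos), 0)
    return cells


def cell_label(cells, lat, lon):
    """Return the index of the cell containing (lat, lon), or None."""
    for index, (s, w, n, e) in enumerate(cells):
        if s <= lat < n and w <= lon < e:
            return index
    return None


# Toy data: three photos near Paris, one in New York, one in Sydney.
photos = [(48.85, 2.35), (48.86, 2.29), (48.87, 2.33),
          (40.71, -74.00), (-33.87, 151.21)]
cells = build_cells(photos)
labels = [cell_label(cells, lat, lon) for lat, lon in photos]
```

In this toy run the three Paris photos end up sharing one small cell, while New York and Sydney each fall into a much larger one, which is the "multi-scale" effect: dense areas get fine cells and sparse areas get coarse ones.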
The results suggest that PlaNet performs about 50 percent better than other location-recognition models. When the team conducted trials with geotagged Flickr images, the system assigned 48 percent of them to the right continent, 28.4 percent to the right country, 10.1 percent to the right city, and 3.6 percent to the actual street, said Technology Review.
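One way such per-level percentages could be computed is to measure the distance between each predicted and true location and count how many fall within a radius chosen for each level. The sketch below assumes that approach; the article does not say how the levels were scored, and the radii in `THRESHOLDS_KM` are illustrative assumptions, as are the function names.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Assumed radii for each level; not taken from the article.
THRESHOLDS_KM = {"street": 1, "city": 25, "country": 750, "continent": 2500}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_by_level(predicted, actual, thresholds=THRESHOLDS_KM):
    """Fraction of predictions whose error is within each level's radius."""
    errors = [haversine_km(p[0], p[1], a[0], a[1])
              for p, a in zip(predicted, actual)]
    return {level: sum(err <= km for err in errors) / len(errors)
            for level, km in thresholds.items()}


# Toy example: one exact hit, one guess of London for a Paris photo,
# and one guess on the wrong continent.
predicted = [(48.8584, 2.2945), (51.5, -0.12), (40.0, -100.0)]
actual = [(48.8584, 2.2945), (48.8584, 2.2945), (-33.87, 151.21)]
scores = accuracy_by_level(predicted, actual)
```

Because the radii are nested, the fractions can only grow from street to continent, which matches the ordering of the figures reported above.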
Interestingly, PlaNet does not analyze the same cues as humans, yet it still achieves “super-human” performance. Tobias Weyand thinks that the system has an advantage over humans because it has seen many more places than any human can ever visit, and it has learned different scenes that are hard to recognize even for humans.
The system can also determine the location of some pictures taken indoors that show no location cues, but only when the images are part of an album containing other pictures that do carry such information.
There is no information on whether PlaNet will be released to users. However, the researchers explained that the model uses only 377 MB, which is a small amount of memory compared to other systems that use gigabytes of information.
Source: MIT Tech Review