Ever since Google released this blog post about its image recognition software, followed by its source code two weeks later, the internet has been flooded with a nightmarish collection of psychedelic images filled with dogs and eyes. So, what is Deep Dream?
Over the last few decades, computers have become very good at specialised tasks with a narrow, well-defined objective, like beating Garry Kasparov at chess. However, generalised intelligence of the sort that humans find easy still remains out of reach. For example, it is still hard to program systems to perform seemingly trivial tasks such as determining the subject of a photograph. In an attempt to build such generalised image recognition software, researchers at Google used artificial neural networks, a machine learning model commonly used for object and speech recognition.
An artificial neural network is a network of interconnected nodes arranged in layers, with the output of each layer serving as the input to the next. The input, an image for example, is presented to the input layer, and each subsequent layer processes it further, until the final layer produces an output, in this case a set of object categories. During the training phase, this output is compared to the desired output, and the weights of the connections between the nodes are adjusted to minimise the difference between the two. For training, the researchers first classified a database of 1.2 million images of common objects, like dogs, bananas, buildings and humans, into a thousand categories and presented them to the network.
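To make that training loop concrete, here is a toy sketch in PyTorch. The layer sizes, the random stand-in images and labels, and the optimiser settings are illustrative assumptions on our part, not the network or the 1.2-million-image dataset Google actually used.

```python
import torch
import torch.nn as nn

# A toy version of the training loop described above: labelled images go in,
# the network's guesses are compared with the true labels, and the connection
# weights are nudged to shrink the difference.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 256), nn.ReLU(),   # a "hidden" layer
    nn.Linear(256, 1000),                     # one score per object category
)
loss_fn = nn.CrossEntropyLoss()               # difference between output and desired output
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

images = torch.rand(32, 3, 64, 64)            # stand-in for a batch of labelled photos
labels = torch.randint(0, 1000, (32,))        # their category indices

for step in range(100):
    predictions = model(images)
    loss = loss_fn(predictions, labels)       # how far is the output from the labels?
    optimizer.zero_grad()
    loss.backward()                           # how did each weight contribute to the error?
    optimizer.step()                          # adjust weights to reduce that error
```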
While this is the standard approach, we don’t know what goes on in the “hidden layers”, i.e., which key features the network recognises as, say, a fork. It is commonly believed that the lower layers extract simpler features such as edges and corners, whereas the higher layers look for overall shapes, and the final few layers assemble and interpret these into whole objects. To test this, researchers at Google flipped the network upside down and visualised its representation of various objects. If they ask the network to find dogs in an image, it picks up even the slightest resemblance to a dog and modifies the image to bring out those dog-like features. The modified image is then fed back into the network, and the dog-like features are amplified further, until you obtain what the neural network “sees” as a dog. Deep Dream tries to identify all the objects it has been trained to recognise in an image and modifies the image to amplify them.
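That amplification step is essentially gradient ascent on the image itself. Here is a rough sketch of the feedback loop, again in PyTorch; the pretrained GoogLeNet from torchvision, the choice of layer (inception4c), the step size and the file names are our own illustrative assumptions, not the settings from Google’s released code.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Assumption: torchvision's pretrained GoogLeNet stands in for the network in the article.
model = models.googlenet(weights="DEFAULT").eval()

# Capture the activations of one intermediate ("hidden") layer via a forward hook;
# deeper layers tend to produce more object-like patterns.
activations = {}
model.inception4c.register_forward_hook(
    lambda module, inp, out: activations.update(target=out)
)

preprocess = T.Compose([T.Resize(512), T.ToTensor()])
img = preprocess(Image.open("campus.jpg")).unsqueeze(0).requires_grad_(True)

for _ in range(20):                          # a handful of gradient-ascent steps
    model(img)
    loss = activations["target"].norm()      # how strongly does the chosen layer fire?
    loss.backward()
    with torch.no_grad():
        # Nudge the image in the direction that excites the layer even more,
        # i.e. amplify whatever features it already "sees" there.
        img += 0.01 * img.grad / (img.grad.abs().mean() + 1e-8)
        img.grad.zero_()
        img.clamp_(0, 1)                     # keep pixel values valid

T.ToPILImage()(img.detach().squeeze(0)).save("campus_dream.jpg")
```

Running the loop for more steps, or hooking a shallower layer, changes whether the output is dominated by swirling edge-like textures or by fully formed eyes and animals.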
Visualising the network’s representation of, say, a dog can be really useful, as it lets us check whether the network has learnt the right features. For example, the network’s representation of a dumbbell always includes a disembodied hand holding it. From this, we know that the network has failed to distil the features that actually define a dumbbell, and we can correct the training process so that computers get better at recognising images.
So yes, androids do dream: of electric sheep, and of hypnotic landscapes where hybrid animals merge into each other. Here, we take you on a photo tour of the campus dreamscape.
Let us enter the campus dreamscape
Birds often hide among the foliage but not from Google Deep Dream
Look who’s flicking the ball
The Central Library as seen by Google Deep Dream
Google Deep Dream thinks that Gajendra is a dog
We have been seeing weird singers at Saarang lately
Anything can happen in the darkness
Didn’t you know? Butterflies often metamorphose into birds
It is frightening to walk around IIT Madras at night
Of flying dogs and wild boars
Now, that is a very strange insect
This is weirder than the previous one
You will never burst a bubble with your hands again
Some more strange animals
It is all monkey business
Your escape route from this madness
PS: You can generate your own Deep Dream images here or here; these are the Deep Dream engines we used to generate our own images. Beware: some of the images generated are really gross. Take a look at Death by Cheese.