I’ve looked at clouds from both sides now
From up and down and still somehow
It’s cloud illusions I recall
I really don’t know clouds at all.
-Joni Mitchell
I had a wonderful conversation with a colleague this week about machine learning. Not about the specific algorithms and mathematics, but about the philosophy that makes ML tick — the general approach that makes it work as well as it does.
“Machine learning”, as some of you know, is an approach to heuristic algorithms (sometimes known by the sexier term “artificial intelligence”). When a problem is too difficult for a computer to solve by straight-ahead computation, sometimes we resort to sneakier methods — approaches that try to look for shortcuts to a solution, and usually (but not always) find them.
Machine learning makes heavy use of such shortcuts: it looks at lots of examples of “things like this” by sifting through large amounts of data, and then uses those examples to make better guesses about new things. For example, if you want your algorithm to recognize faces, you can “train” it by showing it lots of example photos — often stored “in the cloud” — that somebody has already labeled as pictures of faces.
The conversation I had this week was about something a little more subtle: machine learning usually works because it uses information about big things to figure out something about small things, and also information about small things to figure out something about big things.
For example, early techniques for recognizing faces usually started by looking at a low-resolution version of a picture and saying “hey, here’s a fuzzy blob that might be a face.” Then they looked at a higher-resolution version of the same picture to check for things like eyes, a nose and a mouth in the proper places.
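That two-step, big-to-small recipe can be sketched in a few lines of Python. Everything here — the region format, the part names, both helper functions — is an invented stand-in for illustration, not a real detector:

```python
# Toy sketch of the old coarse-to-fine pipeline: flag fuzzy blobs at
# low resolution, then verify face parts at high resolution.

def find_blobs(image):
    """Pretend coarse pass: return regions that *might* be faces."""
    return [region for region in image if region["blobby"]]

def has_face_parts(region):
    """Pretend fine pass: check for eyes, nose and mouth up close."""
    return {"eyes", "nose", "mouth"} <= set(region["parts"])

def detect_faces(image):
    candidates = find_blobs(image)          # big: fuzzy blobs
    return [r for r in candidates if has_face_parts(r)]  # small: verify

scene = [
    {"blobby": True,  "parts": ["eyes", "nose", "mouth"]},  # a real face
    {"blobby": True,  "parts": ["window", "curtain"]},      # face-like blob
    {"blobby": False, "parts": ["leaf"]},                   # background
]
print(len(detect_faces(scene)))  # prints 1
```

Note that the information only flows one way here: the fine pass can reject a blob, but nothing the fine pass finds ever proposes a face on its own.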
This didn’t work very well, because in a low-res picture there are lots of fuzzy blobs that might be a face, but when you look more closely, most of them turn out not to be faces. Machine learning ups the game by going in both directions at once.
Not only does it look for faces and check whether there are eyes, noses and mouths inside; it simultaneously looks for smaller features like eyes, noses and mouths, and checks whether they sit inside bigger features that look like faces.
The big power-up here is that we’re checking both “big to small” and “small to big”, looking in particular for connections that work in both directions.
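The two-way check can be sketched like this. The box format, part names, and hard pass/fail rule are all invented for illustration — real systems score soft evidence rather than making yes/no checks — but the shape of the idea is the same:

```python
# Toy sketch of checking "both directions at once": a face hypothesis
# and its part hypotheses must each support the other.

def inside(part, box):
    (px, py) = part["pos"]
    (x0, y0, x1, y1) = box
    return x0 <= px <= x1 and y0 <= py <= y1

def mutually_consistent(face_box, parts):
    # Big to small: the face region must contain all the expected parts.
    found = {p["kind"] for p in parts if inside(p, face_box)}
    big_to_small = {"eye", "nose", "mouth"} <= found
    # Small to big: every detected part must sit inside the face region.
    small_to_big = all(inside(p, face_box) for p in parts)
    return big_to_small and small_to_big

face = (0, 0, 10, 10)
parts = [{"kind": "eye",   "pos": (2, 3)},
         {"kind": "nose",  "pos": (5, 5)},
         {"kind": "mouth", "pos": (5, 8)}]

print(mutually_consistent(face, parts))                                # prints True
print(mutually_consistent(face, parts + [{"kind": "eye", "pos": (40, 2)}]))  # prints False
```

A fuzzy blob with no parts inside fails the big-to-small check, and a stray eye floating outside any face fails the small-to-big check — only hypotheses that survive both directions count.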
It seems pretty simple when you put it like that. Yet this simple change in thinking has had a huge impact on our ability to use computers to recognize things.