Neat and interesting read from Nautilus
A few months ago, my aunt sent her colleagues an email with the subject, “Math Problem! What is the answer?” It contained a deceptively simple puzzle.
She thought her solution was obvious. Her colleagues, though, were sure their solution was correct—and the two didn’t match. Was the problem with one of their answers, or with the puzzle itself?
My aunt and her colleagues had stumbled across a fundamental problem in machine learning, the study of computers that learn. Almost all of the learning we expect our computers to do, and much of the learning we ourselves do, is about reducing information to underlying patterns, which can then be used to infer the unknown. Her puzzle was no different.
For a human, the challenge is to find any pattern at all. Of course, we have intuitions that limit our guesses. But computers have no such intuitions. From a computer’s standpoint, the difficulty in pattern recognition is one of surplus: with an endless variety of patterns, all technically valid, what makes one “right” and another “wrong”?
The problem only recently became of practical concern. Before the 1990s, AI systems rarely did much learning at all. For example, the chess-playing Deep Thought, predecessor to Deep Blue, didn’t get good at chess by learning from successes and failures. Instead, chess grandmasters and programming wizards carefully crafted rules to teach it which board positions were good or bad. Such extensive hand-tuning was typical of that era’s “expert systems” approach.
To tackle my aunt’s puzzle, the expert systems approach would need a human to squint at the first three rows and spot the following pattern:
1 * (4 + 1) = 5
2 * (5 + 1) = 12
3 * (6 + 1) = 21
The human could then instruct the computer to follow the pattern x * (y + 1) = z. Applying this rule to the final line yields a solution of 96.
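As a rough illustration of the expert-systems style, here is a minimal Python sketch in which the rule is written by hand and the program simply applies it. The three known rows are read off the pattern lines above. The final row's inputs (8 and 11) are an assumption: the text does not show them, but they are consistent with the stated answer of 96. The names `KNOWN_ROWS`, `FINAL_INPUTS`, and `hand_coded_rule` are made up for this example.

```python
# Known rows of the puzzle as (x, y, z), taken from the pattern lines above.
KNOWN_ROWS = [(1, 4, 5), (2, 5, 12), (3, 6, 21)]

# Assumption: the final row's inputs are 8 and 11. The text does not show
# them; these values are consistent with the stated answer of 96.
FINAL_INPUTS = (8, 11)


def hand_coded_rule(x: int, y: int) -> int:
    """The pattern a human spotted and handed to the machine: x * (y + 1)."""
    return x * (y + 1)


# Confirm the rule reproduces every known row before applying it.
for x, y, z in KNOWN_ROWS:
    assert hand_coded_rule(x, y) == z

answer = hand_coded_rule(*FINAL_INPUTS)
print(answer)  # 96
```

Nothing here is learned: all of the insight lives in the one line a person wrote, which is what made such systems laborious to build and maintain.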
Despite expert systems’ early success, the manual labor required to design, tune, and update them became unwieldy. Instead, researchers turned their attention to designing machines that could infer patterns on their own. A program could inspect, say, thousands of photos or market transactions and tease out statistical signals suggesting a face or an impending price spike. This approach quickly came to dominate, and has since powered everything from automated postal sorting to spam filtering to credit card fraud detection.
And yet. With all their successes, these machine learning systems still needed engineers in the loop. Consider again my aunt’s puzzle. We assumed that each line has three relevant components (the three numbers in the line). But there’s a potential fourth element: the result on the previous line. If that attribute of a line—that feature, in machine learning parlance—is in bounds, then another plausible pattern emerges:
0 + 1 + 4 = 5
5 + 2 + 5 = 12
12 + 3 + 6 = 21
By this logic the final answer should be 40.
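The same check can be sketched in Python once the previous line's result is admitted as a feature. As before, the final row's inputs (8 and 11) are an assumption not shown in the text, chosen because they are consistent with the stated answer of 40. The starting value of 0 for the first row mirrors the first pattern line above, and the names are hypothetical.

```python
# Known rows of the puzzle as (x, y, z), taken from the pattern lines above.
ROWS = [(1, 4, 5), (2, 5, 12), (3, 6, 21)]

# Assumption: the final row's inputs are 8 and 11 (not shown in the text).
FINAL_INPUTS = (8, 11)


def running_sum_rule(previous: int, x: int, y: int) -> int:
    """Alternative pattern that uses the previous result as a fourth feature."""
    return previous + x + y


previous = 0  # no earlier line exists, so the first row starts from zero
for x, y, z in ROWS:
    # Each known row must match when fed the result of the row before it.
    assert running_sum_rule(previous, x, y) == z
    previous = z

answer = running_sum_rule(previous, *FINAL_INPUTS)
print(answer)  # 40
```

The only change from a rule on two inputs is the extra `previous` argument, yet that one added feature is enough to produce a different answer from the same three rows.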
So which pattern is right? Both, of course, and neither. It all depends on which patterns are allowed. You could also find a pattern by taking the first number times the second number, adding one-fifth of three more than the previous answer, and rounding to the nearest integer. (It’s weird, but it works!) And if we allow features that consider the visual forms of the numbers, perhaps we could come up with some pattern involving strokes and serifs.
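For the skeptical, the odd-looking rule can be checked with a short Python sketch. It reads "one-fifth of three more than the previous answer" as (previous + 3) / 5 and takes the previous answer before the first row to be 0; both readings are assumptions, as are the final row's inputs (8 and 11), which the text does not show. The function name is invented for this example.

```python
# Known rows of the puzzle as (x, y, z), taken from the earlier pattern lines.
ROWS = [(1, 4, 5), (2, 5, 12), (3, 6, 21)]

# Assumption: the final row's inputs are 8 and 11 (not shown in the text).
FINAL_INPUTS = (8, 11)


def odd_rule(previous: int, x: int, y: int) -> int:
    """x times y, plus one-fifth of (previous + 3), rounded to an integer."""
    return round(x * y + (previous + 3) / 5)


previous = 0  # assumed starting value before the first row
for x, y, z in ROWS:
    # 4.6 -> 5, 11.6 -> 12, 21.0 -> 21: the rule fits all three known rows.
    assert odd_rule(previous, x, y) == z
    previous = z

answer = odd_rule(previous, *FINAL_INPUTS)
print(answer)  # 93
```

Under these assumptions the rule fits the three known rows exactly and then yields 93 for the final row, a third answer that differs from both 96 and 40.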