The computer’s next conquest: crossword puzzles
Published 5:00 am Saturday, March 17, 2012
What is a 10-letter word for smarty pants?
This weekend the world may find out when computer technology again tries to best human brains, this time at the American Crossword Puzzle Tournament in Brooklyn, N.Y.
Computers can make mincemeat of chess masters and vanquish the champions of “Jeopardy!” The question is: Can the trophy go to a crossword-solving program, Dr. Fill — a wordplay on filling in a crossword (get it?) and the screen name of the talk show host Dr. Phil McGraw — when it tests its algorithms against the wits of 600 of the nation’s top crossword solvers?
DOCTOR FILL was created by Matthew Ginsberg, 56, chief executive of On Time Systems in Eugene. He holds a Ph.D. from Oxford, taught at Stanford and wrote a book on artificial intelligence. As a hobby, he also constructs crossword puzzles, including more than two dozen published in The New York Times.
The program has already excelled in most simulations of 15 past tournaments, finishing on top three times. Dr. Fill is a speed demon. It can successfully complete easier puzzles in a minute; even lightning-fast human solvers take about three minutes. Hard puzzles may take it three minutes, about half as long as human whizzes need.
Whatever Dr. Fill’s final ranking at the Brooklyn matchup, which ends Sunday, the program is an impressive achievement, experts say, and a sign of the times. In cerebral games, like chess, bridge, “Jeopardy!” and crossword puzzles, computers can now perform comparably to the top tier of human players — sometimes a bit better, but also sometimes a bit worse.
Humans and machines play the games very differently. Humans recognize patterns based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer.
“We’re at the point where the two approaches are about equal,” said Peter Norvig, a leading artificial intelligence expert, who is a research director at Google. “But people have real experience. A computer has a shadow of that experience.”
Also, people tend to have a sense of humor, and this turns out to be helpful.
Puzzle constructors sometimes put in answers not found in the dictionary. For example, in a puzzle with the theme of rabbits, the answer to famous bank robbers might be BUNNY AND CLYDE, Ginsberg said, which requires a little imagination.
Or take this clue from a 2010 puzzle in The Times: Apollo 11 and 12 (180 degrees). The answer is the letters SNOISSIWNOOW, seemingly gibberish. A clever human could eventually figure out that those letters, when flipped 180 degrees, spell MOON MISSIONS.
This sort of thing requires imagination and creativity. Humans get the joke, while a literal-minded computer does not. “Occasionally, Dr. Fill just doesn’t get it,” Ginsberg said. “That’s my nightmare.”
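Once the trick is spotted, the flip itself is mechanical; the hard part is thinking to try it. As an illustration only, here is a minimal Python sketch of the 180-degree turn. The letter table and the function name rotate_180 are assumptions made for this example and are not taken from Dr. Fill.

```python
# Capital letters that still read as letters after a 180-degree turn.
# This table is an assumption for illustration and is not exhaustive.
ROTATED = {"M": "W", "W": "M", "N": "N", "O": "O", "S": "S",
           "I": "I", "H": "H", "X": "X", "Z": "Z"}

def rotate_180(text):
    """Turn a string upside down: reverse the order, then rotate each letter.

    Returns None if any letter has no upside-down counterpart in ROTATED.
    """
    letters = [c for c in text.upper() if c.isalpha()]
    try:
        return "".join(ROTATED[c] for c in reversed(letters))
    except KeyError:
        return None

print(rotate_180("SNOISSIWNOOW"))  # MOONMISSIONS
```

A solver would still need some reason to attempt the rotation, which is the step the clue's "(180 degrees)" hints at.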
At the tournament, players will get six puzzles to solve today, one at a time, and one on Sunday — progressively more difficult. Rankings are determined by accuracy and speed; contestants raise their hands when done, and roving referees mark the time.
The top three finishers enter a playoff round with an eighth puzzle on Sunday afternoon, competing for the $5,000 prize. All the contestants can try to solve the puzzle for fun. Game challenges, though, are not just fun and games, but serious science that has opened the door to practical applications.
“Games are a great motivator for artificial intelligence — they push things forward,” said David Ferrucci, the IBM researcher who led the development of Watson, the “Jeopardy!” computer champion. “But what really matters is where it is taking us.”
Watson, for example, is being adapted for business uses, first in health care to assist doctors in making diagnoses.
Ginsberg’s real job is chief executive of On Time Systems, in Eugene, whose software helps in tasks like calculating the most efficient flight paths for aircraft. The Air Force uses the programs for optimizing its noncombat flights, saving 20 million gallons of fuel a year, he said.
Some of the statistical techniques for figuring air routes are also handy for solving crossword puzzles. A typical puzzle might have 75 words, and for each answer slot, across or down, the dictionary may hold up to 10,000 words with the right number of letters.
To narrow its choices, Dr. Fill taps a database of millions of crossword answers and clues. If it spots a match, that is a sure thing.
If not, Dr. Fill calculates the 100 most probable answers, based on a number of factors, including how prevalent one of its millions of crossword-related words is in Google’s directory of the Web.
Dr. Fill can fill a puzzle in as little as five seconds, but then the program does fit and finish work. For example, its initial best guess for a five-letter word across might be BEZEL, Ginsberg explained. The Z, though, might conflict with a higher-probability answer in a crossing word, going down, which would put W in that space. So Dr. Fill would change BEZEL to JEWEL.
The time spent on juggling to find an optimal array of answers varies depending on the difficulty of the puzzle — typically from one to three minutes.
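The BEZEL-to-JEWEL adjustment can be pictured with a small Python sketch. It is not Ginsberg's code: the function best_joint_fill, the candidate words and every probability below are invented for illustration, and the sketch looks at a single crossing rather than a whole grid. It picks the pair of across and down answers with the highest combined score among pairs that agree on the shared letter.

```python
from itertools import product

def best_joint_fill(across, down, across_idx, down_idx):
    """Return the (across_word, down_word) pair with the highest combined
    score whose letters agree at the crossing cell, or None if no pair fits.

    across, down: dicts mapping candidate word -> estimated probability.
    across_idx, down_idx: position of the shared cell within each word.
    """
    best_pair, best_score = None, 0.0
    for (a, pa), (d, pd) in product(across.items(), down.items()):
        if a[across_idx] != d[down_idx]:
            continue  # letters clash in the shared square
        score = pa * pd  # treats the two guesses as independent (an assumption)
        if score > best_score:
            best_pair, best_score = (a, d), score
    return best_pair

# Hypothetical candidates and probabilities, invented for this example.
across_candidates = {"BEZEL": 0.40, "JEWEL": 0.35, "BEVEL": 0.10}
down_candidates = {"WAX": 0.90, "ZAP": 0.05}

# The third letter of the across word shares a square with the first
# letter of the down word.
print(best_joint_fill(across_candidates, down_candidates, 2, 0))
# ('JEWEL', 'WAX')
```

Taken alone, BEZEL is the top across guess, but it fits only the unlikely ZAP, so the pair JEWEL and WAX scores higher overall. A full solver would have to balance many such crossings at once.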
So how smart is Dr. Fill really?
Will Dr. Fill get wordplay?
“On the easier puzzles, I think Dr. Fill will kill the field,” said Will Shortz, the tournament director and crossword puzzle editor for The Times, who has seen a demonstration of Ginsberg’s program.
The real hurdle for Dr. Fill, and perhaps its comeuppance, will come from the harder puzzles, especially those with the tricky themes or wordplay, Shortz said.
Dr. Fill was flummoxed by a puzzle from a previous tournament that had the theme of spoonerisms — the switching of first letters in two words. So a clue might be heavy mist, and a logical answer would be LIGHT RAIN. But spoonerized, it becomes RIGHT LAIN. An expert human solver, Shortz said, can recognize the twist, see the pattern.
“You slap your head and say, ‘Oh, now I get it,’” Shortz said.
Not so for Dr. Fill, a bundle of computer code on a notebook computer. “It was totally adrift,” Ginsberg lamented.
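The swap itself is simple to state in code once a solver knows to look for it. The following Python sketch, with a hypothetical function name spoonerize, applies the definition given above (switching the first letters of two words) and nothing more.

```python
def spoonerize(phrase):
    """Swap the first letters of a two-word phrase."""
    first, second = phrase.split()
    return f"{second[0]}{first[1:]} {first[0]}{second[1:]}"

print(spoonerize("LIGHT RAIN"))  # RIGHT LAIN
```

The difficulty for a program lies in recognizing that a puzzle's theme calls for this transformation, not in carrying it out.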
In 1999, Michael Littman, then at Duke University, and a team of a dozen researchers created a program that would have finished 147th out of 255 contestants in that year’s tournament. “It did pretty well, but it wasn’t at championship level,” said Littman, a professor of computer science at Rutgers University.
Ginsberg expects to do better, probably in the top 50. With only eight puzzles, the range of possible outcomes is wide.
“If I’m lucky, I’ll win,” he said. “If I’m unlucky, I’ll end up 150th,” which would still be in the top fourth among the 600 contestants.
Dan Feyer, an ace solver who has won the last two tournaments, is betting that Shortz, who commissions and edits the puzzles, will include one with a quirky, imaginative twist to try to stump the computer.
Shortz isn’t saying. But he is handing out buttons to anyone who finishes ahead of the computer: “I Beat Dr. Fill.” And he is making sure that even if Dr. Fill wins, it will not taste all the fruits of victory. The machine is not eligible for the $5,000 prize.
“The tournament is for humans,” Shortz said.