The future of talking, here soon?

Published 5:00 am Saturday, May 23, 2009

The young American soldier recalled the time in Iraq he came across the badly burned little girl. He was on patrol. Trouble ahead. A house had been set on fire. In front of it was the girl, just standing there, all alone.

There he stood, helplessly, in full battle rattle, with his ballistic glasses and helmet, his weapon bristling, his body armor making him waddle like a bipedal rhino.

He spoke no Arabic. He couldn’t comfort her, he couldn’t tell her he wanted to get her medical help.

“I sure wish I’d had one of those,” he told Jennifer Gollob.

Gollob points to a machine that easily fits in a bag the size of a woman’s purse. It’s a universal translator. It is being tested in Iraq by DARPA — the Defense Advanced Research Projects Agency — the legendary research and development works in Arlington, Va., where Gollob is a contractor.

The machine interprets the spoken word. You talk in English. It repeats whatever you said in spoken Iraqi Arabic. It then awaits a spoken response from the Iraqi, and talks back to you in English.

It’s pretty good, says Mari Maeda, the program’s manager. About 70 or 80 percent accurate. Not as good as a human. But the number of human interpreters willing to work around gunfire is finite.

DARPA is aiming to get an affordable iPod-size interpreter on the chest of every American warrior, foreshadowing the day such devices will be as common as music players.

Independently, Google is deploying its strikingly successful Translate project. It instantly translates text among 41 languages from Bulgarian to Hindi with surprising felicity. The big question is how soon Google will release a voice version, making the world’s cell phones multilingual.

That sound you hear? It’s the sound, after all these millennia, of the Tower of Babel rising once again.

What is ‘good enough’?

On Jan. 7, 1954, IBM announced, with great fanfare: “Russian was translated into English by an electronic ‘brain’ today for the first time.” Routine machine translation, we were told, was only five years away.

Half a century later, computers have mastered challenges that impress even geneticists, chess grandmasters and research librarians. But machines still have the devil’s own time with routines common to any healthy 2-year-old. Becoming fluent with languages, for example.

To this day, if you want to get a translation absolutely right, go find yourself a talented human. “Nuclear power,” says Kevin Hendzel, a spokesman for the American Translators Association, when asked about areas where you want tremendously good human translation. “Negotiations for disarmament. The pharmaceutical industry. Zero-error work with millions of dollars” riding on the outcome. Hendzel has served as an interpreter on the presidential hot line.

The trouble with meticulous, culturally sensitive human translation, of course, is that it is slow, pricey and rare.

Suppose you are willing to settle for blazingly fast, cheap, “good enough” translations. Especially those aimed at languages spoken by the rich, multitudinous or dangerous. Enter the new generation of machine translators that in the last year have begun to open broad new vistas.

For decades, translation programs tried to be rules-based. Teach the machine that in English the adjective comes before the noun; in French it’s the reverse. Seems logical. But not only is it tedious and expensive to get a bunch of linguists to collect such intricacies, it produces laughable results. Just try Yahoo Babel Fish, for example. Language turns out not to be an Industrial Age machine of discrete parts.

One linguist, discussing the problem on the technology news Web site Slashdot, writes: “Parsing English is easy by comparison. I work with another language where there is a slight stress difference between the sentences ‘That might be true’ and ‘He’s honestly picking his butt.’ The words ‘soup’ and ‘(poop)’ are differentiated by a 40-50 percent increase in the length of the last vowel. There is one word for both ‘blue’ and ‘green,’ and another word for ‘yellow,’ ‘orange,’ and ‘brown.’ ”

The explosion of the Web, however, has enabled a revolution. Like so many successful human approaches, it relies on brute force and ignorance. This method cares little for how any language works. It just looks — Rosetta stone fashion — at huge amounts of text translated into different languages by humans. (Dump decades of U.N. documents into the maw.) Then it lets the machine statistically express the probability that words in one language line up together in a fashion comparable to another set of words in another language.
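For the curious, here is a toy sketch of that idea. It is not Google's or DARPA's system. It assumes a simple word-alignment method in the style of the classic IBM Model 1, and the three English-German sentence pairs, along with the names train_alignment and best_translation, are invented for illustration. The program is told nothing about grammar. It only counts which words keep showing up across from each other and turns those counts into probabilities.

```python
from collections import defaultdict


def train_alignment(pairs, iterations=10):
    """Estimate t[(f, e)], the probability that source word e
    translates to target word f, using expectation-maximization
    on sentence pairs (IBM Model 1 style, without a NULL word)."""
    target_vocab = {f for _, tgt in pairs for f in tgt}
    # Start in total ignorance: every target word is equally likely.
    t = defaultdict(lambda: 1.0 / len(target_vocab))

    for _ in range(iterations):
        count = defaultdict(float)   # expected count of e aligned to f
        total = defaultdict(float)   # expected count of e aligned to anything
        for src, tgt in pairs:
            for f in tgt:
                # Share the credit for f among the source words
                # in the same sentence, in proportion to current belief.
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # Re-estimate the probabilities from the expected counts.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


def best_translation(t, word):
    """Return the target word with the highest probability for `word`."""
    candidates = {f: p for (f, e), p in t.items() if e == word}
    return max(candidates, key=candidates.get)


# A made-up parallel corpus: (English words, German words).
corpus = [
    ("the house".split(), "das Haus".split()),
    ("the book".split(), "das Buch".split()),
    ("a book".split(), "ein Buch".split()),
]

t = train_alignment(corpus)
for english in ["the", "house", "book", "a"]:
    print(english, "->", best_translation(t, english))
```

Nothing in the program says that "the" is an article or that "Haus" is a noun. The pairings fall out of the repeated counting, which is why the real systems need so much translated text to learn from.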

For this statistical approach to work, of course, you need astounding computer power and zillions of pages of text.

Whom does this make you think of?

Google, perhaps?

This also means that the people who do the statistical approach do not talk about programming their software. They talk about “training” it.

Cue the spooky music.

Yes, we are creeping up on artificial intelligence here.

‘Who did what to whom’

“It is coming,” Peter Norvig says of the day when cell phones translate conversation. “We don’t announce things before their time. But there will be products coming out soon. The early generations will be only for the early adopters, and then later on it will reach the masses.”

Norvig is the director of research at Google, arguably the world’s leader in machine translation. “Certainly we’re the broadest. We have over 40 languages and we translate between all pairs of them … in any subject domain … and nobody else does that.”

Google still hires professional human translators to create high-value pages, like the ones in French telling people how to use Google. “It’s a matter of ownership,” he says of taking pride in presentation.

But Norvig refers to professional human translators as “a small guild” carving up a market of a few billion dollars. With Google Translate, he’s talking about making billions of routine pages more available than ever for billions of ordinary people.

“I think most of the time now, you take a newspaper article” and run it through Translate “and you can understand what’s going on. It will be very rare that you think a native speaker did the translation. You’ll notice disfluencies in every sentence. But you’ll know who did what to whom.”

Indeed, on “Meteor,” a 1-to-100 scale of these things in which 40 means you’re getting the general idea, and 70 is as good as most human translators, Google gets in the 50s on the Arabic-English pairing, says Alon Lavie, president of the Association for Machine Translation in the Americas. “Far better than gist. Pretty damn good. They’re the 800-pound gorilla.”

Google wants to own speech. Whenever you call 1-800-Goog-411 and say “pizza,” you are teaching their computers to associate the way you say that word with its text version, Mike Cohen of Google told Technology Review.

Using those smarts, in November, Google unveiled an app to search on any topic you can imagine by talking into your iPhone. Automatically and relentlessly, day and night, that feature provides even more real-world training for their voice-recognition bots.

When all this becomes a routine part of Google’s Android mobile software, how big a deal will it be to culture and society to have a cell phone that will allow you to talk to most of the world’s 6 billion people?

“In some ways I am more enthusiastic about the text part” of translation, Norvig says. “I think that opens up a lot. If you’re a speaker of a minority language — say, Arabic — how much of the Web is accessible to you? Well, it’s really a small portion of 1 percent or so. But if we can now translate those Web pages, now all of a sudden the whole world opens up to you. It’s a lot more information and it’s also different worldviews.”

Common language

The world’s common language is not English, it’s broken English, says Alex Waibel, of Carnegie Mellon, a DARPA principal investigator, born in Germany, who spends his life in international conferences where English is everybody’s second or fourth language. Eighty percent machine accuracy is better than some very large portion of these alleged English speakers, he says.

“Human translators aren’t actually that great,” Waibel says. In one study, people listened to a machine interpreter and then were asked questions to measure their grasp of content. The score was 64 on a 100-point scale. Not wonderful. But when they did the same test with a human simultaneous interpreter, the result was not a lot better — a 74.

“When humans try to figure out how to translate one thing, they drop their attention as to what’s coming in the next graph,” Waibel says. “And they’re human. They get tired. They get bored.”

A good machine can really lubricate human connection, Waibel reports. When global researchers hit the town after a conference in Japan, they plopped one of his translators down in the middle of the table. A grand old sake-fueled time was had, as they communicated in ways beyond the unaided capabilities of any of them.

But can all our cleverness re-create a Genesis world, in which “the people is one, and they have all one language … and now nothing will be restrained from them, which they have imagined to do”?

There’s that nagging problem of how much more clever our flesh and blood are than our creations. Waibel recalls his family visiting Australia. His 2-year-old son, Joshua, looked out from the hotel lobby at a creature loping across the lawn.

“Kangaroo!” he said.

Waibel’s eyes go wide at the very scope of this accomplishment. He’s devoted his life to figuring out how to allow machines to make connections among words.

How do you replicate the way a toddler accurately and instantly makes the connection between some cartoon he’d glanced at months ago and an utterly novel real-world situation?

How did his little brain do that?
