Defense News

May 28, 2007

New Translation Technology To Aid U.S. Forces

By WILLIAM H. McMICHAEL

The room is quiet, save for a few words spoken into the microphone.

“Are you aware of any terrorist activity around here?”

A flat, disembodied voice from a laptop computer immediately repeats the English phrase, and then, just as quickly, repeats the question in Iraqi Arabic.

A live role-player hears it, replies in Arabic, and in just over two seconds, the disembodied voice repeats his answer — in perfect English.

“No, we do not have terrorists, but Kasim knows one of them.”

The system — BBN Technologies’ Foreign Language Conversation Translation Device — represents the wave of the future for troops in foreign lands who don’t have human translators at their sides.

It’s not there yet. Nor are the other four devices being evaluated for the U.S. Defense Advanced Research Projects Agency (DARPA) by the National Institute of Standards and Technology (NIST). But the five devices have taken huge leaps toward a point long thought impossible to reach: “free-form, two-way dialogue in tactically relevant environments,” as DARPA puts it.

Basic one-way translators are common in Iraq, where U.S. Joint Forces Command has fielded 3,000 devices and the Army’s Rapid Equipping Force has sent another 1,000 or so, among other sources. These devices, such as VoxTec’s Phraselator, can translate simple, mission-specific English phrases such as “How many of you are there?” into Arabic.

But their vocabularies are limited — and they can’t understand the answer.

The BBN device, like the other four being tested for DARPA, is far more sophisticated. For instance, “kasim” also means “divided” in Arabic, but the BBN translator understood the context in which the word was being used and recognized it as a name. It also turned the question, “Are you married?” into “How many wives do you have?” — a more culturally astute phrasing in Iraq, said Premkumar Natarajan, the BBN device’s project director.

Like the others, the BBN device has limitations. Despite the use of a directional microphone and advanced software, the system’s performance would significantly degrade in a high-stress situation where normal tones become frantic shouts and background noise is high. But the capability to operate in those situations is coming, Natarajan said.

The BBN device also maintains a digital log of the conversation, as well as a searchable database, he said, “so you can easily mine through for all kinds of information, across sessions. ... ‘Oh, Kasim’s gone to al-Hella. I’ve heard this al-Hella word in the last few days. I want to search who else has gone to al-Hella.’”
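As a rough illustration of what such a cross-session, searchable log might look like, here is a minimal sketch. It is not BBN’s software; the field names, session labels and example entries are assumptions made for illustration, drawing on the “al-Hella” example above.

```python
# Minimal sketch of a searchable conversation log spanning multiple sessions.
# Illustrative only: field names, session labels and example entries are
# assumptions, not BBN's actual data model.

from dataclasses import dataclass

@dataclass
class Utterance:
    session: str   # e.g., a date or checkpoint identifier
    speaker: str   # "interviewer" or "respondent"
    arabic: str    # original transcript (placeholder here)
    english: str   # machine translation of the utterance

log = [
    Utterance("session-01", "respondent", "(Arabic transcript)",
              "Kasim has gone to al-Hella"),
    Utterance("session-02", "respondent", "(Arabic transcript)",
              "My brother has gone to al-Hella"),
]

def search(term: str):
    """Return every logged utterance, across sessions, that mentions the term."""
    term = term.lower()
    return [u for u in log
            if term in u.english.lower() or term in u.arabic.lower()]

# Example: find everyone reported to have gone to al-Hella.
for hit in search("al-Hella"):
    print(hit.session, "->", hit.english)
```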

And because everyone’s voice is unique, the recordings could serve another useful function: as legal proof — as solid as a fingerprint — of someone’s identity, Natarajan said.

The technological advances, very basically, come from a move away from “teaching” a machine to memorize recorded words or phrases. Instead, speech is digitized into wave forms and run through algorithms that translate what is being said in each language into synthesized speech.
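Conceptually, that amounts to a cascade of speech recognition, machine translation and speech synthesis. The sketch below shows the shape of such a pipeline; the stage functions are placeholders, not any vendor’s actual code or API.

```python
# Schematic of one turn of a two-way speech-to-speech translation cascade.
# The three stage functions are placeholders standing in for real recognition,
# translation and synthesis components; this is not any evaluated system's code.

def recognize(waveform: bytes, language: str) -> str:
    """Speech recognition: convert a digitized wave form to text."""
    raise NotImplementedError("plug in a speech-recognition engine here")

def translate(text: str, source: str, target: str) -> str:
    """Machine translation: convert text from one language to the other."""
    raise NotImplementedError("plug in a translation engine here")

def synthesize(text: str, language: str) -> bytes:
    """Text-to-speech: render the translated text as synthesized speech."""
    raise NotImplementedError("plug in a speech-synthesis engine here")

def translate_turn(waveform: bytes, source: str, target: str) -> bytes:
    """One conversational turn: hear speech in `source`, speak it in `target`."""
    text = recognize(waveform, source)            # e.g., Iraqi Arabic audio -> Arabic text
    translated = translate(text, source, target)  # Arabic text -> English text
    return synthesize(translated, target)         # English text -> English audio
```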

DARPA is supporting the work of BBN, SRI International, IBM, VoxTec, Integrated Wave Technologies and Sehda.

NIST, however, is testing only five specific devices, made by IBM, SRI, Sehda, BBN and Carnegie Mellon University.

The first tests on the devices, all in laptop form, were held at NIST’s Gaithersburg, Md., campus. Three progressively noisier evaluations started in a lab and ended in a structured, multiscenario field environment, said Craig Schlenoff, a robotics researcher and project manager at NIST.

DARPA and NIST won’t comment on the performance of the individual systems, but DARPA says all perform in the 70 percent to 80 percent accuracy range. The long-term goal, said DARPA spokeswoman Jan Walker, is two-way translations across all subjects with 100 percent accuracy, with background noise, dialects and accents taken into account.

In the short term — the next three to five years, she said — DARPA wants 80 percent to 90 percent accuracy for specific task-related phrases.

“We are optimistic that in the near future, we will be able to deliver a device that will be able to translate successfully 80 to 90 percent of the time when speakers articulate carefully and stick to specific subject areas during the conversation,” said Mari Maeda, program manager for DARPA’s Translation Systems for Tactical Use program.

DARPA also wants to move away from laptops and toward a much smaller, hands-free device. And that’s the requirement for the next round of NIST tests, slated for July.

Developers have free rein over the form of the device, but its use cannot require looking at a laptop or employing a keyboard, Schlenoff said.

Schlenoff’s team took the devices to the Joint Readiness Training Center at Fort Polk, La., to let troops in the field test them.

“They say, ‘We see a lot of progress, but we’re not ready to take it to the battlefield,’” Schlenoff said. “But the capability of the technology is clearly getting better over time.”