Carnegie Mellon team’s electronic translator requires only a set of lips

A new device could allow users to say something in another language without actually having to speak the language — or speak at all for that matter.

A team of researchers at the Language Technologies Institute (LTI) has developed a prototype for a device that translates what a user is saying, but does not require the user to speak the words out loud.

Instead, the device relies on electrodes attached to the muscles on the face and neck to detect small electrical currents that are produced whenever someone moves his or her lips to speak.

These electrical currents form typical patterns when the user is speaking certain words. A computer chip in the device monitors these patterns and recognizes the words.

The text is then translated into a different language, and the device speaks the translation aloud using a speech synthesis program developed by the company Cepstral.
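At a high level, the processing chain described above runs from muscle signals to recognized words, then to translated text, then to synthesized speech. The sketch below is purely illustrative, with hypothetical function names (classify_emg_frames, translate, synthesize) standing in for the components; it is not the LTI team's actual software.

```python
# Illustrative sketch of the silent-speech translation pipeline described above.
# Every component function here is a hypothetical stand-in, not the LTI team's code.

def classify_emg_frames(emg_frames):
    """Map windows of facial/neck EMG signals to recognized words (hypothetical)."""
    # A real system would run a trained classifier over features of the signals.
    return ["where", "is", "the", "train", "station"]

def translate(words, source="en", target="es"):
    """Translate the recognized word sequence into the target language (hypothetical)."""
    return "¿Dónde está la estación de tren?"

def synthesize(text, voice="spanish"):
    """Hand the translated text to a text-to-speech engine (hypothetical call)."""
    print(f"[TTS:{voice}] {text}")

def silent_translate(emg_frames):
    words = classify_emg_frames(emg_frames)   # 1. recognize words from muscle activity
    translation = translate(words)            # 2. translate the recognized text
    synthesize(translation)                   # 3. speak the translation aloud

silent_translate(emg_frames=[])  # placeholder input
```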

Currently, the prototype supports translation from English to Spanish or German, and from Chinese to English.

But a new development in October shifted the focus of the research from recognizing the patterns of whole words to recognizing individual phonemes, the smaller units of sound that make up words. This makes the device more flexible because it can link those sounds together to form many more words.
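The practical difference between word-level and phoneme-level recognition can be shown with a small pronunciation dictionary: once the recognizer outputs phonemes, any word whose pronunciation appears in the dictionary can be assembled, even if it was never modeled as a whole-word pattern. The dictionary entries and matching logic below are invented for illustration only.

```python
# Toy illustration: composing words from a recognized phoneme sequence.
# The pronunciation dictionary and phoneme strings are invented examples.

PRONUNCIATIONS = {
    "cat": ["K", "AE", "T"],
    "cab": ["K", "AE", "B"],
    "bat": ["B", "AE", "T"],
}

def words_from_phonemes(phonemes):
    """Return every dictionary word whose pronunciation matches a span of the phoneme stream."""
    matches = []
    for word, pron in PRONUNCIATIONS.items():
        for start in range(len(phonemes) - len(pron) + 1):
            if phonemes[start:start + len(pron)] == pron:
                matches.append((start, word))
    return sorted(matches)

# A recognizer that only modeled whole words would need a separate template per word;
# a phoneme-based one can cover new words just by adding dictionary entries.
print(words_from_phonemes(["K", "AE", "T", "B", "AE", "T"]))  # [(0, 'cat'), (3, 'bat')]
```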

So far, the prototype has only about a hundred words in its vocabulary, enough for users to form 30 to 40 sentences. But the average person produces far more than 40 sentences on any given day, and expanding the vocabulary to cover everyday speech makes it harder for the translator to remain stable and accurate.

“The more units you allow in your search, the higher the confusability. The higher the confusability, the higher the chances that you can confuse words among each other,” said research scientist Tanja Schultz.

“We haven’t really fully explored this technology for a large vocabulary. We don’t know what the performance would be if we do this.”
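Schultz's point about confusability can be made concrete with a rough measure such as phoneme distance: as the vocabulary grows, more word pairs end up only one sound apart and are therefore easy to mistake for one another. The word list and distance check below are a simplified illustration, not the team's actual metric.

```python
# Rough illustration of confusability: count word pairs whose (invented) pronunciations
# differ by only one phoneme. A larger vocabulary tends to contain more such pairs.
from itertools import combinations

PRONUNCIATIONS = {
    "pat": ["P", "AE", "T"],
    "bat": ["B", "AE", "T"],
    "bad": ["B", "AE", "D"],
    "mat": ["M", "AE", "T"],
    "sun": ["S", "AH", "N"],
}

def differ_by_one(a, b):
    """True if two equal-length pronunciations differ in exactly one phoneme."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

confusable = [
    (w1, w2)
    for (w1, p1), (w2, p2) in combinations(PRONUNCIATIONS.items(), 2)
    if differ_by_one(p1, p2)
]
print(confusable)  # [('pat', 'bat'), ('pat', 'mat'), ('bat', 'bad'), ('bat', 'mat')]
```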

Although there are other translators available on the market, they don’t function the way the team’s translator does. Current translators use microphones and speech-recognition software to “listen” to what the user is saying. As a result, they don’t work as well when there is a lot of background noise, especially in crowded environments such as airports or emergency rooms, and the translation is less accurate when the speech is unclear.

But because the new translator works even when the user is silent, background noise is much less of a problem, and it is much easier for the user to make sure that only what he or she is saying is actually translated. In any case, an accurate translation is vital.

“The other intriguing thing about the prototype is that you can mouth words,” Schultz said. “For everybody who’s doing speech recognition and who knows how challenging it is to get speech recognition to work in these noisy environments, I think that’s another very neat aspect.”

While the translators currently on the market let a user review the recognized text on a screen before the device speaks the translation, the prototype aims to speak the translation instantaneously.

The user could check the translation before it reaches the listener by having the translation run in the background, but that would add a delay to every response, and checking the translation while speaking would be distracting. Researchers therefore have to make the device accurate enough that users can trust the translation without reviewing it.

A silent translator would also be useful in situations that require someone to be quiet, such as a meeting. In one demonstration of the device, the user answers the phone and responds to the other person without causing a disturbance because he “spoke” silently, letting the device talk for him.

Although the device has many potential uses, bringing it to the broader public could be more difficult. The prototype requires wired electrodes to be attached to the face and throat, making the unit difficult to carry around.
However, Schultz believes this might not be such an issue in the future if researchers can develop wireless electrodes that could be implanted in the face.

“People do all sorts of piercing and tattooing, and so maybe at some point we can, for example, inject little pieces into our cheek,” said Schultz. “Who knows what’s coming?”

People might not welcome the idea right now, but such minor enhancements could end up being very persuasive, especially because they would let people communicate effectively in multiple languages. By building technology that powers translation, researchers could help break down the barriers between the thousands of languages in the world.

“It would really allow us to tear down the language borders,” said Schultz.