NEW YORK - IBM unveiled new speech recognition technology on Tuesday that can comprehend the nuances of spoken English, translate it on the fly, and even generate real-time subtitles for foreign-language television programs.
Historically, speech technology required users to limit their speech to a fixed set of phrases in order to interact with a device. With IBM's Embedded ViaVoice 4.4 software package, introduced on Tuesday, the company aims to let users speak commands using phrasing that is natural to them.
In a demonstration today at IBM's headquarters here, for example, users changed a simulated radio station by speaking any of the following phrases: “Play 92.3,” “Tune to 92.3” or “Tune the radio to 92.3.”
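The idea behind that demonstration, several phrasings resolving to one command, can be illustrated with a short sketch. This is not IBM's method or the ViaVoice API, neither of which is described here; it assumes the speech has already been transcribed to text, and the `TUNE_PATTERNS` list and `parse_command` function are hypothetical names invented for the illustration.

```python
import re

# Hypothetical patterns: each one accepts a different natural phrasing
# of the same "tune the radio" command. Not taken from ViaVoice.
TUNE_PATTERNS = [
    re.compile(r"^play (?P<freq>\d{2,3}\.\d)$"),
    re.compile(r"^tune (?:the radio )?to (?P<freq>\d{2,3}\.\d)$"),
]


def parse_command(utterance):
    """Return ("tune", frequency) for a recognized phrasing, else None."""
    # Normalize case, surrounding whitespace, and a trailing period.
    text = utterance.strip().lower().rstrip(".")
    for pattern in TUNE_PATTERNS:
        match = pattern.match(text)
        if match:
            return ("tune", float(match.group("freq")))
    return None


if __name__ == "__main__":
    for phrase in ("Play 92.3", "Tune to 92.3", "Tune the radio to 92.3"):
        print(phrase, "->", parse_command(phrase))
```

All three phrases from the demonstration map to the same `("tune", 92.3)` result, while an unrelated sentence yields `None`. A real system would presumably rely on statistical language models rather than hand-written patterns, which is what makes open-ended phrasing hard.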
Though speech recognition is already built into products like Microsoft's Office XP, many users still prefer to use their keyboards.
Speech recognition software can be trained to recognize a particular user's voice. But interpreting sounds from a variety of speakers is more challenging, unless a limited library of sounds, or phonemes, is used.
Though computer speech recognition is still far from perfect, the future is bright, according to David Nahamoo, a manager in the human language technologies department at IBM Research.
“At IBM, we have this superhuman speech recognition [initiative in which] the goal is to get performance comparable to humans in the next five years,” Nahamoo said.