Imagine having access to all of the world’s recorded conversations and the videos that people have posted to YouTube, in addition to chatter collected by random microphones in public places. Then picture being able to search that dataset for clues related to terms that interest you, the same way you search Google. You could look up, for example, who is having a conversation right now about plastic explosives, about a particular flight departing from Islamabad, or about Islamic State leader Abu Bakr al-Baghdadi in reference to a particular area of northern Iraq.
On Nov. 17, the U.S. announced a new challenge called Automatic Speech recognition in Reverberant Environments, giving it the acronym ASpIRE. The challenge comes from the Office of the Director of National Intelligence, or ODNI, and the Intelligence Advanced Research Projects Agency, or IARPA. It speaks to a major opportunity for intelligence collection in the years ahead: teaching machines to scan the ever-expanding world of recorded speech. To do that, researchers will need to take a decades-old technology, computerized speech recognition, and re-invent it from scratch.
Importantly, the ASpIRE challenge is only the most recent government research program aimed at modernizing speech recognition for intelligence gathering. The so-called Babel program from IARPA, as well as such DARPA programs as RATS (Robust Automatic Transcription of Speech), BOLT (Broad Operational Language Translation) and others have all had similar or related objectives.
To understand what the future of speech recognition looks like, and why it doesn’t yet work the way the intelligence community wants it to, it is first necessary to know what speech recognition is. In a 2013 paper titled “What’s Wrong With Speech Recognition,” researcher Nelson Morgan defines it as “the science of recovering words from an acoustic signal meant to convey those words to a human listener.” It’s different from speaker recognition, or matching a voiceprint to a single individual, but the two are related.
Speech recognition is focused more precisely on getting a machine to understand speech well enough to instantly transcribe spoken words into text or usable data. Anyone who has ever used a program like Dragon NaturallySpeaking might think that this is a largely solved problem. But most automatic transcription programs are useful in only a few situations, which limits their effectiveness for intelligence collection.
It seems like an easy challenge for a military in the process of outfitting robotic boats with lasers, but speech recognition, especially in diverse environments, is incredibly difficult despite decades of steady research and funding.
Read the Full Article: Source – Defenseone
http://www.defenseone.com/technology/2014/12/what-happens-when-spies-can-eavesdrop-any-conversation/100142/