CLSP Researchers Among Winners of IARPA ASpIRE Challenge

September 11, 2015

Researchers from the Center for Language and Speech Processing at Johns Hopkins University are among the winners of the Intelligence Advanced Research Projects Activity’s (IARPA) Automatic Speech Recognition in Reverberant Environments (ASpIRE) challenge. The winning teams, from Johns Hopkins University, Raytheon BBN Technologies, the Institute for Infocomm Research, and Brno University of Technology, will share $110,000 in prizes.

Typically, speech recognition systems are ‘trained’ on speech recorded in environments very similar to those in which they are expected to be used. The ASpIRE challenge contestants tackled a harder problem: building accurate systems for automatically transcribing speech recorded in noisy and reverberant environments without knowing anything about the recording devices or the acoustics of the space, and without training data that resembled the contest’s test conditions. At the start of the challenge, contestants were given telephone speech data with which to develop and train their systems over a period of roughly three months. Their systems were then tested on very different speech recordings, collected in noisy rooms of various sizes and shapes and with various microphone configurations. This mismatch between training data and test data is what made the ASpIRE challenge uniquely difficult.

Challenge entries were scored under two evaluation conditions: the single microphone condition and the multiple microphone condition. Vijayaditya Peddinti, Guoguo Chen, Dr. Daniel Povey, and Dr. Sanjeev Khudanpur won in the single microphone category, which tested accuracy of speech recognition on recordings from single microphones selected arbitrarily from among six microphones placed in the room.

All of the ASpIRE challenge winners delivered systems with more than a 50% reduction in word error rate (WER) compared to the IARPA baseline system. WER is the standard measure of accuracy for speech recognition systems; lower WER scores indicate more accurate systems.
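As a rough illustration of how WER is usually computed, the sketch below counts the word-level substitutions, deletions, and insertions needed to turn a system's transcript into the reference transcript, then divides by the number of reference words. This is a minimal sketch and not the scoring tool used in the challenge; the function names and example sentences are made up for illustration, and it assumes the "50% reduction" above is a relative reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, system_wer: float) -> float:
    """Fraction by which system_wer improves on baseline_wer (assumed metric)."""
    return (baseline_wer - system_wer) / baseline_wer


# Hypothetical transcripts, not taken from the challenge data.
reference = "the cat sat on the mat"
baseline_output = "the hat sat on a mat"   # two wrong words out of six
improved_output = "the cat sat on a mat"   # one wrong word out of six

baseline_wer = word_error_rate(reference, baseline_output)
improved_wer = word_error_rate(reference, improved_output)
reduction = relative_reduction(baseline_wer, improved_wer)
```

In this toy case the improved transcript halves the number of word errors, which corresponds to a 50% relative reduction in WER.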

The speech data were collected by the Linguistic Data Consortium. Appen Butler Hill transcribed the microphone recordings. MIT Lincoln Laboratory and IARPA together evaluated the results. InnoCentive managed the challenge website, including its leaderboard.

Visit the IARPA website for the full list of winners.

 

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2680