ASR Machines That Know When They Do Not Know

Goal: Develop ASR systems that can cope with new, unexpected data (“systems that know when they do not know”, or “getting rid of unknown unknowns”).

To constrain the problem and provide resources for team members with different ideas, the core problem is stated as follows: given a classifier that yields, for each frame, a vector of posterior probabilities for the speech sounds of interest, predict the accuracy of these estimates without knowing the correct probabilities on the test data, but knowing the performance of the classifier on the training data.

The main attack on this problem will be through multi-stream processing, in which many parallel and partially redundant processing streams are derived from the information-bearing data. This approach should be effective in the many practical situations where unexpected signal distortions degrade only some of the processing streams, while the remaining streams can still be used to extract the targeted information. The technique needs to be unsupervised, since the ground truth for the unknown data is not available, and fast, since new, unexpected data must be handled as they arrive.
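The multi-stream idea can be illustrated with a minimal sketch. It assumes that each stream already has a confidence score obtained without ground truth, and that the posteriors of the most trusted streams are simply averaged. The function name `fuse_streams`, the `keep` parameter, and the averaging rule are hypothetical choices made for illustration, not the project's actual fusion method.

```python
def fuse_streams(stream_posteriors, stream_scores, keep=2):
    """Average the posteriors of the `keep` highest-scoring streams.

    stream_posteriors: list of streams; each stream is a list of frames,
                       and each frame is a list of class posteriors.
    stream_scores:     one confidence score per stream (higher = more trusted).
                       How the scores are obtained is outside this sketch.
    Returns (indices of the chosen streams, fused frame-level posteriors).
    """
    if len(stream_posteriors) != len(stream_scores):
        raise ValueError("need exactly one score per stream")
    if keep < 1:
        raise ValueError("keep must be at least 1")

    # Rank streams from most to least trusted and keep the best ones.
    ranked = sorted(range(len(stream_scores)),
                    key=lambda i: stream_scores[i], reverse=True)
    chosen = ranked[:keep]

    fused = []
    for t in range(len(stream_posteriors[0])):
        n_classes = len(stream_posteriors[0][t])
        # Average each class posterior over the chosen streams only.
        frame = [sum(stream_posteriors[i][t][c] for i in chosen) / len(chosen)
                 for c in range(n_classes)]
        fused.append(frame)
    return chosen, fused
```

A stream with a low score (for example, one covering a frequency band hit by noise) is left out of the average, so its corrupted posteriors do not contaminate the fused estimate.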

To date, research at JHU has produced band-limited processing streams based on artificial neural nets for the recognition of noisy speech, and a couple of techniques for estimating classifier performance from the temporal dynamics of the classifier outputs. JHU will provide its multi-stream experimental system with 31 processing streams based on independent artificial neural net classifiers. Initial results on the recognition of noise-corrupted TIMIT have already been obtained and will serve as a baseline. We will also provide the true accuracies for all processing streams, which will serve as the ideal targets of our efforts.
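As a hedged illustration of what a performance estimate based on the temporal dynamics of classifier outputs might look like, the sketch below computes the mean symmetric Kullback-Leibler divergence between posterior vectors a fixed number of frames apart. The working assumption is that a reliable classifier produces sharp posteriors that change as the speech sounds change, giving a large divergence, while a confused classifier produces flat, slowly varying posteriors, giving a small one. The function names and the choice of divergence are assumptions made for this example; the source does not specify the techniques developed at JHU.

```python
import math


def symmetric_kl(p, q, eps=1e-10):
    """Symmetric KL divergence KL(p||q) + KL(q||p) of two posterior vectors."""
    total = 0.0
    for pi, qi in zip(p, q):
        # Floor the probabilities to avoid log(0) and division by zero.
        pi = max(pi, eps)
        qi = max(qi, eps)
        total += (pi - qi) * math.log(pi / qi)
    return total


def mean_temporal_distance(posteriors, delta):
    """Average divergence between posterior vectors `delta` frames apart.

    posteriors: list of frames, each a list of class posteriors.
    delta:      time span in frames (hypothetical tuning parameter).
    """
    n_pairs = len(posteriors) - delta
    if delta < 1 or n_pairs < 1:
        raise ValueError("delta must be >= 1 and shorter than the utterance")
    return sum(symmetric_kl(posteriors[t], posteriors[t + delta])
               for t in range(n_pairs)) / n_pairs


# Toy usage: confident, changing posteriors versus flat, uninformative ones.
sharp = [[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3
flat = [[0.5, 0.5]] * 6
print(mean_temporal_distance(sharp, 3) > mean_temporal_distance(flat, 3))
```

The measure needs no transcription of the test data, which matches the unsupervised requirement. How well it tracks true accuracy would have to be checked against the true per-stream accuracies.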


Team Members
Team Leader
Hynek Hermansky, Johns Hopkins University
Senior Members
Lukas Burget, Brno University of Technology
Jordan Cohen, Spelamode Consulting
Naomi Feldman, University of Maryland
Tetsuji Ogawa, Waseda University
Richard Rose, McGill University
Richard Stern, Carnegie Mellon University
Graduate Students
Matthew Maciejewski, Carnegie Mellon University
Harish Mallidi, Johns Hopkins University
Anjali Menon, Carnegie Mellon University
Vijayaditya Peddinti, Johns Hopkins University
Matthew Wiesner, McGill University
Affiliate Members
Eleanor Chodroff, Johns Hopkins University
Emmanuel Dupoux, Laboratoire de Science Cognitive et Psycholinguistique
John Godfrey, Johns Hopkins University
Sanjeev Khudanpur, Johns Hopkins University

Center for Language and Speech Processing