Learning from Speech Production for Improved Recognition – Karen Livescu (TTI Chicago)

April 5, 2013 all-day

Speech production has motivated several lines of work in the speech recognition research community, including using articulator positions predicted from acoustics as additional observations and using discrete articulatory features as lexical units instead of, or in addition to, phones. Unfortunately, our understanding of speech production is still quite limited, and articulatory data are scarce. How can we take advantage of the intuitive usefulness of speech production without relying too heavily on noisy information? This talk covers recent work exploring several ideas in this area, with the theme of using machine learning to automatically infer information where our knowledge and data are lacking. The talk will include work on deriving new acoustic features from articulatory data in a multi-view learning setting, as well as lexical access and spoken term detection using hidden articulatory features.
Karen Livescu is an Assistant Professor at TTI-Chicago, where she has been since 2008. She completed her PhD in 2005 at MIT in the Spoken Language Systems group of the Computer Science and Artificial Intelligence Laboratory. From 2005 to 2007 she was a postdoctoral lecturer in the MIT EECS department. Karen's main research interests are in speech and language processing, with a slant toward combining machine learning with knowledge from linguistics and speech science. She is a member of the IEEE Spoken Language Technical Committee and has organized or co-organized a number of recent workshops, including the ISCA SIGML workshops on Machine Learning in Speech and Language Processing and Illinois Speech Day. She is co-organizing the upcoming Midwest Speech and Language Days and the Interspeech 2013 Workshop on Speech Production in Automatic Speech Recognition.

Center for Language and Speech Processing