New Machine Learning Approaches to Speech Recognition – Alex Acero (Microsoft)

July 22, 2010 all-day

In this talk I will describe some new approaches to speech recognition that leverage large amounts of data using techniques from information retrieval and machine learning. Speech recognition has been proposed for many years as a natural way to interact with computer systems, but its adoption has taken longer than originally expected. At first dictation was thought to be the killer speech app, and researchers were waiting for Moore’s law to deliver the required computational power in low-cost PCs. Then it turned out that, for social and cognitive reasons, many users do not want to dictate to their computers even when doing so is inexpensive and technically feasible. Speech interfaces that users like require not only a sufficiently accurate speech recognition engine but also attention to many other, less well-known factors. These include a scenario where speech is the preferred modality, proper end-to-end design, error correction, robust dialog, ease of authoring for non-speech experts, data-driven grammars, a positive feedback loop with users, and robustness to noise. In this talk I will describe some of the work we have done at MSR on building natural user interfaces using speech technology, and will illustrate it with a few scenarios in gaming (Xbox Kinect), speech in automotive environments (Hyundai/Kia UVO, Ford SYNC), etc.
Alex Acero is Research Area Manager in Microsoft Research, directing an organization with 60 engineers working on audio, multimedia, communication, speech, and natural language. He is also an affiliate Professor of Electrical Engineering at the University of Washington. He received an M.S. degree from the Polytechnic University of Madrid, Madrid, Spain, in 1985, an M.S. degree from Rice University, Houston, TX, in 1987, and a Ph.D. degree from Carnegie Mellon University, Pittsburgh, PA, in 1990, all in Electrical Engineering. Dr. Acero worked in Apple Computer’s Advanced Technology Group during 1990-1991. In 1992, he joined Telefonica I+D, Madrid, Spain, as Manager of the speech technology group. Since 1994 he has been with Microsoft Research.
Dr. Acero is a Fellow of IEEE. He has served the IEEE Signal Processing Society as Vice President Technical Directions (2007-2009), Director Industrial Relations (2009-2011), 2006 Distinguished Lecturer, member of the Board of Governors (2004-2005), Associate Editor for IEEE Signal Processing Letters (2003-2005) and IEEE Transactions on Audio, Speech, and Language Processing (2005-2007), and member of the editorial board of IEEE Journal of Selected Topics in Signal Processing (2006-2008) and IEEE Signal Processing Magazine (2008-2010). He also served as member (1996-2000) and Chair (2000-2002) of the Speech Technical Committee of the IEEE Signal Processing Society. He was Publications Chair of ICASSP98, Sponsorship Chair of the 1999 IEEE Workshop on Automatic Speech Recognition and Understanding, and General Co-Chair of the 2001 IEEE Workshop on Automatic Speech Recognition and Understanding. Dr. Acero served as member of the editorial board of Computer Speech and Language and member of the Carnegie Mellon University Dean’s Leadership Council for the College of Engineering.
Dr. Acero is author of the books “Acoustical and Environmental Robustness in Automatic Speech Recognition” (Kluwer, 1993) and “Spoken Language Processing” (Prentice Hall, 2001), and has written invited chapters in 4 edited books and 200 technical papers. He holds 78 US patents. Since 2004, Dr. Acero, along with co-authors Drs. Huang and Hon, has been using proceeds from their textbook “Spoken Language Processing” to fund the “IEEE Spoken Language Processing Student Travel Grant” for the best ICASSP student papers in the speech area.

Center for Language and Speech Processing