Jinyu Li (Microsoft) “Deep Learning Acoustic Model in Microsoft Cortana Voice Assistant”
3400 N Charles St
Baltimore, MD 21218
USA
Abstract
Deep learning acoustic modeling has been widely deployed in real-world speech recognition products and services that benefit millions of users. In this talk, I will first briefly describe selected developments and investigations at Microsoft aimed at making deep learning networks more effective in production environments, with a focus on computational cost reduction, knowledge transfer, and model robustness. Then, I will introduce our recent efforts in adapting models with unlabeled data and our work on separating speech from multiple speakers.
Biography
Jinyu Li received the Ph.D. degree from the Georgia Institute of Technology, Atlanta, U.S. From 2000 to 2003, he was a Researcher at the Intel China Research Center and a Research Manager at iFlytek Speech, China. He has been with Microsoft since 2008. Currently, he is a Principal Applied Scientist, serving as a technical lead to design and improve speech modeling algorithms and technologies that ensure industry state-of-the-art speech recognition accuracy for Microsoft products such as the Cortana voice assistant. His major research interests cover several topics in speech recognition, including deep learning and noise robustness. He has authored more than 70 refereed publications and 20 patents. He is the lead author of the book "Robust Automatic Speech Recognition: A Bridge to Practical Applications" (Academic Press, October 2015). Currently, he serves as an associate editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing.