Measuring and Using Speech Production Information – Shri Narayanan (Viterbi School of Engineering/University of Southern California)

March 13, 2012 all-day

The human speech signal carries crucial information not only about communication intent but also about affect and emotions. From a basic scientific perspective, understanding how such rich information is encoded in human speech can shed light on the underlying communication mechanisms. From a technological perspective, finding ways to automatically process and decode this complex information in speech continues to be of interest for a variety of applications. One line of work in this realm aims to connect these perspectives by creating technological advances to obtain insights about basic speech communication mechanisms and by utilizing direct information about human speech production to inform technology development. Both of these engineering problems will be considered in this talk.
A longstanding challenge in speech production research has been the ability to examine real-time changes in the shaping of the vocal tract, a goal that has been furthered by imaging techniques such as ultrasound, movement tracking and magnetic resonance imaging. The spatial and temporal resolution afforded by these techniques, however, has limited the scope of the investigations that could be carried out. In this talk, we will highlight recent advances that allow us to perform near real-time investigations of the dynamics of vocal tract shaping during speech. We will also use examples from recent and ongoing research to describe some of the methods and outcomes of processing such data, especially toward facilitating linguistic analysis and modeling, and speech technology development. [Work supported by NIH, ONR, and NSF.]
Shrikanth (Shri) Narayanan is the Andrew J. Viterbi Professor of Engineering at the University of Southern California (USC), where he holds appointments as Professor of Electrical Engineering, Computer Science, Linguistics and Psychology, and as Director of the USC Ming Hsieh Institute. Prior to USC he was with AT&T Bell Labs and AT&T Research. His research focuses on human-centered information processing and communication technologies. He is a Fellow of the Acoustical Society of America, IEEE, and the American Association for the Advancement of Science (AAAS). He is also an Editor for Computer Speech and Language and an Associate Editor for the IEEE Transactions on Multimedia, IEEE Transactions on Affective Computing, APSIPA Transactions on Signal and Information Processing, and the Journal of the Acoustical Society of America. He is a recipient of several honors, including Best Paper awards from the IEEE Signal Processing Society in 2005 (with Alex Potamianos) and in 2009 (with Chul Min Lee), and selection as a Distinguished Lecturer for the IEEE Signal Processing Society for 2010-11. He has published over 475 papers and has twelve granted US patents.

Center for Language and Speech Processing