A Detection Approach to Speech Recognition and Understanding

Chin-Hui Lee of Bell Laboratories
at the CLSP/JHU Summer Research Workshop on July 22, 1998 at 10:30 am, Arellano Theater, Levering Hall.

Automatic speech recognition is often accomplished by decoding with a well-defined language model, typically an N-gram. However, for many problems encountered in real life, a complete characterization of the task grammar is unavailable because there is too little training data. This degrades speech recognition performance, especially when the input utterances are spontaneous, as in most real-life dialogue tasks.

In this talk, we discuss a new paradigm based on a detection approach. Instead of trying to characterize the grammar entirely, we construct a family of detectors, each designed to locate specific speech events of interest. These events, such as keywords and key phrases, are related to the task at a higher level and can be specified even without any training data. The speech utterance is first processed by the collection of detectors to produce an event lattice. The lattice is then parsed, incorporating more speech and language knowledge, to generate multiple candidate answers. These candidates are then rescored to produce the recognized sentence. When the events are labeled with semantic attributes, the recognized sentence can also be "understood" to generate corresponding semantic actions. We report experimental results on two real-life tasks and conclude that the proposed detection approach is more robust than the conventional decoding approach, especially in dealing with ill-formed inputs.
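The pipeline described above (detect events, build a lattice, parse it into candidates, rescore, then map semantic attributes to an action) can be illustrated with a toy sketch. This is not the system presented in the talk: it works on text tokens rather than audio, the detectors are plain phrase matchers, and every name, phrase, and score below is a made-up assumption chosen only to show the flow of data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    label: str    # key phrase that was detected
    start: int    # index of the first token covered
    end: int      # index one past the last token covered
    score: float  # detector confidence (hypothetical values)
    slot: str     # semantic attribute attached to the event


def make_detector(phrase, slot, score):
    """Build a detector that spots one key phrase in a token list."""
    words = phrase.split()

    def detect(tokens):
        n = len(words)
        return [Event(phrase, i, i + n, score, slot)
                for i in range(len(tokens) - n + 1)
                if tokens[i:i + n] == words]

    return detect


def build_lattice(tokens, detectors):
    """Run every detector and collect the events, ordered by position."""
    events = [e for detect in detectors for e in detect(tokens)]
    return sorted(events, key=lambda e: (e.start, e.end))


def parse(lattice):
    """Enumerate all non-empty sequences of non-overlapping events."""
    hypotheses = []

    def extend(path, pos, idx):
        if path:
            hypotheses.append(tuple(path))
        for j in range(idx, len(lattice)):
            if lattice[j].start >= pos:
                extend(path + [lattice[j]], lattice[j].end, j + 1)

    extend([], 0, 0)
    return hypotheses


def rescore(hypotheses):
    """Pick the hypothesis with the highest total score (toy criterion)."""
    return max(hypotheses, key=lambda h: sum(e.score for e in h))


def understand(hypothesis):
    """Turn the semantic attributes of the events into a slot/value action."""
    return {e.slot: e.label for e in hypothesis}


# Hypothetical task: flight booking, with a disfluent, ill-formed input.
detectors = [
    make_detector("fly", "action", 0.9),
    make_detector("from new york", "origin", 0.8),
    make_detector("york", "origin", 0.3),  # competing, overlapping event
    make_detector("to boston", "destination", 0.85),
]
tokens = "uh i want to um fly from new york to boston please".split()

lattice = build_lattice(tokens, detectors)
best = rescore(parse(lattice))
print([e.label for e in best])
print(understand(best))
```

Note that the filler words ("uh", "um", "please") are never modeled: no detector fires on them, so they simply do not enter the lattice. That is the property the abstract points to when it claims robustness to ill-formed inputs.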

Chin-Hui Lee received the B.S. degree in Electrical Engineering from National Taiwan University, Taipei, in 1973, the M.S. degree in Engineering and Applied Science from Yale University, New Haven, in 1977, and the Ph.D. degree in Electrical Engineering with a minor in Statistics from the University of Washington, Seattle, in 1981. In 1981, Dr. Lee joined Verbex Corporation, Bedford, MA, and was involved in research work on connected word recognition. In 1984, he became affiliated with Digital Sound Corporation, Santa Barbara, where he engaged in research and product development in speech coding, speech synthesis, speech recognition and signal processing for the development of the DSC-2000 Voice Server. Since 1986, he has been with Bell Laboratories, Murray Hill, New Jersey, where he is now a Distinguished Member of Technical Staff and Head of the newly established Dialogue Systems Research Department. His current research interests include multimedia signal processing, speech and speaker recognition, speech and language modeling, adaptive and discriminative learning, spoken dialogue processing, biometric authentication and information retrieval. His research scope is reflected in a recent book, entitled "Automatic Speech and Speaker Recognition: Advanced Topics", published by Kluwer Academic Publishers in 1996. Dr. Lee has participated actively in both domestic and international professional societies. He is a member of the IEEE Signal Processing Society, Communications Society, and the European Speech Communication Association. He is also a lifetime member of the Computational Linguistic Society in Taiwan. From 1991 to 1995, he was an associate editor for the IEEE Transactions on Signal Processing and Transactions on Speech and Audio Processing. During the same period, he was also a member of the ARPA Spoken Language Coordination Committee. Since 1995 he has been a member of the Speech Technical Committee of the IEEE Signal Processing Society (SPS).
In 1996 he helped promote the newly established SPS Multimedia Signal Processing (MMSP) Technical Committee and is a member of the MMSP-TC. In recognition of his continuing contributions to speech and language processing in Taiwan, he was recently appointed a member of the Advisory Board of the Institute of Information Science at Academia Sinica, Taipei, Taiwan. Dr. Lee is a Fellow of the IEEE and serves as the Chairman of the SPS Speech Processing Technical Committee. He has published over 170 papers and holds 16 patents on the subject of automatic speech and speaker recognition. He received the SPS Senior Award in 1994 and the SPS Best Paper Award in 1997. His inventions have been widely used in Lucent's products and services deployed over telecommunication networks. Recently he was awarded the prestigious Bell Labs President's Gold Award for his contributions to the Lucent Speech Processing Solutions product.
 
