Fall 2001: CLSP Seminar Series

Detection and Segmentation of phonemes in real time for Lip-Synch application

Hanseok Ko - November 6th, 2001

Korea University and Johns Hopkins University

Presentation Slides: N/A


This seminar addresses a real-time method for detecting and segmenting voiced speech for phoneme recognition in a lip-synch application, in which an avatar's lip motion is driven in synchrony with the recognized phonemes. The talk will focus mainly on how well voiced-speech detection and segmentation can be realized in real time. One of the major difficulties in automatic speech recognition (ASR) is the extreme variability of the speech signal, both at the acoustic-phonetic level and across speakers. Variation between speakers can be handled by pattern recognition approaches such as hidden Markov models (HMMs), given a large amount of training data. At the acoustic-phonetic level, however, robust phone segmentation in real time remains difficult, because the variability is hard to capture or model and prior knowledge (or trained references) is lacking. This is why the performance of real-time phoneme recognition does not measure up to that of conventional HMM-based ASR. Because our focus is a lip-synch application, the avatar's lips must move naturally in synchrony with the recognized phonemes, to the point that the audience cannot perceive any delay. Fortunately, in a lip-synch application, the distinguishable lip motions depend mainly on vowels and some voiced consonants rather than on the full phoneme set. Consequently, unlike all-phoneme recognition, improved performance can be expected if phone detection and segmentation within voiced speech can be made reliable. The seminar will discuss various techniques for voiced-speech detection and segmentation and, time permitting, demonstrate ongoing results.

Biographical Information

For more information, please see his web page.



The Center for Language and Speech Processing
The Johns Hopkins University
3400 North Charles Street, Barton Hall
Baltimore, MD 21218
Telephone: (410) 516-4237
Fax: (410) 516-5050
E-mail: clsp@clsp.jhu.edu