Dynamic Segmental Models of Speech Coarticulation

Automatic speech recognition has achieved significant success by using powerful and complex models to represent and interpret the speech (acoustic) signal. However, these models require unreasonably large amounts of training data, and some researchers believe that the nature and fundamental philosophy of current acoustic-phonetic modelling methods, such as hidden Markov models, are inappropriate. Participants in this project plan to explore a different way of thinking about the nature of speech patterns. Their proposed model has a long history in speech science, but it has yet to be successfully applied to automatic speech recognition.

The speech signal can be thought of as being generated by a relatively low-dimensional system, namely our articulatory organs, moving slowly relative to the variations of the signal picked up by a microphone. The proposed computational model consists of a linear dynamical process describing the smooth movement of the vocal tract resonance, which flows from one phonetic unit to another, with the observed features of the acoustic signal being a nonlinear function of this process. Vocal tract resonance is a characteristic of the vocal tract that is related to the familiar notion of formants; it corresponds roughly to the formants for vocalic sounds, and though it may not correspond to spectral peaks for consonants, it changes smoothly through them as the configuration of the articulators changes. The participating researchers expect that, because of its compactness, this model will be robust even with modest amounts of training data. Computational techniques they plan to use in this project include nonlinear regression, multilayer perceptrons and Kalman filtering.
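The model described above can be sketched as a state-space system: a linear dynamical process that moves the vocal tract resonance (VTR) state smoothly toward each phone's target, observed through a nonlinear mapping (here a fixed random one-hidden-layer perceptron standing in for the learned one). All targets, dimensions, and parameter values below are illustrative assumptions, not the project's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_vtr(targets, frames_per_phone, a=0.3, noise=10.0):
    """Linear dynamics: x_t = x_{t-1} + a * (target - x_{t-1}) + w_t.

    The state exponentially approaches the current phone's VTR target,
    so the trajectory flows smoothly from one phonetic unit to the next.
    """
    x = np.array(targets[0], dtype=float)  # start at the first target
    traj = []
    for tgt in targets:
        tgt = np.asarray(tgt, dtype=float)
        for _ in range(frames_per_phone):
            x = x + a * (tgt - x) + rng.normal(0.0, noise, size=x.shape)
            traj.append(x.copy())
    return np.array(traj)

# Fixed random MLP weights: a stand-in for the trained nonlinear
# observation function mapping VTR state to acoustic features.
W1 = rng.normal(size=(3, 16)); b1 = rng.normal(size=16)
W2 = rng.normal(size=(16, 12)); b2 = rng.normal(size=12)

def observe(x):
    """Nonlinear observation: VTR state (Hz) -> 12-dim acoustic feature."""
    h = np.tanh(x / 1000.0 @ W1 + b1)  # scale Hz to O(1) before the MLP
    return h @ W2 + b2

# Hypothetical VTR targets (Hz) for an /a/ -> /i/ transition.
targets = [[700.0, 1200.0, 2600.0], [300.0, 2300.0, 3000.0]]
traj = simulate_vtr(targets, frames_per_phone=20)
feats = np.array([observe(x) for x in traj])
```

Because the observation function is nonlinear, inferring the hidden VTR trajectory from acoustics would require an approximate method such as the extended Kalman filter, which linearizes `observe` around the current state estimate at each frame.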


Team Members

Senior Members
John Bridle, Dragon UK
Li Deng, Waterloo
Joe Picone, Miss. State
Hywel Richards, Dragon UK
Mike Schuster, Nara, Japan

Graduate Students
Terri Kamm, CLSP
Jeff Ma, Waterloo

Undergraduate Students
Sandi Pike, Brown
Roland Reagan, CMU

Center for Language and Speech Processing