Workshop 2002
August 5, 2002 Guest Lecture



The WS02 Guest Lectures are open to the public, and a video recording of each lecture will be archived in the CLSP library. Talks are held every Wednesday and on selected other dates throughout the workshop. All lectures will be given on the Homewood Campus of Johns Hopkins University.




Guest Lecture Information
The manifold advantages of articulatory representations, including microphone and speaker normalization: John Hogden - 08/05/2002
  • Location: Shaffer Hall, Room 101

  • Time: 2:30 pm - 3:30 pm

  • Abstract:

    A new acoustic model, Maximum Likelihood Continuity Mapping (MALCOM), will be presented. MALCOM generates a stochastic model of speech assuming 1) that speech sounds are periodically emitted as a point moves smoothly through a low-dimensional space called a continuity map (CM), and 2) that the sound emitted at time t is a probabilistic function of the position of the point at time t. The assumptions underlying MALCOM are intended to mimic speech production in that 1) speech sounds are produced as the articulators move slowly through a low-dimensional articulator space, and 2) the speech sound produced at time t is a function of the articulator positions at time t. MALCOM's smoothness constraint implies that MALCOM uses much more temporal context than typical Markov models.

    The parameters required by MALCOM constitute an estimate of the mapping between articulation and acoustics. Surprisingly, no articulator measurements are required for training. To show that MALCOM can invert very general nonlinear functions, including microphone nonlinearities and speaker differences, and so recover articulator positions, we will discuss a mathematical proof, simulation results, and experimental evidence using simultaneously collected acoustic and articulator measurements. We conclude that the articulator positions recovered by MALCOM should provide a better basis for speech recognition than mel-cepstra (or other commonly used acoustic parameters): they are relatively invariant to microphone effects and speaker differences, and they can convey the same information using fewer dimensions, suggesting that they will be less affected by acoustic noise. MALCOM should also be applicable to characterizing speaker differences, and so may be useful for speaker recognition.
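
The two generative assumptions in the abstract can be illustrated with a toy simulation. This is a minimal sketch and not the MALCOM implementation. It assumes that sounds are discrete codes, that each code has an isotropic Gaussian density over a 2-D continuity map, that the codes have equal priors, and that the smooth path is a low-pass-filtered random walk. The names smooth_path, code_posterior, and emit_codes, and the parameters alpha and sigma, are hypothetical and chosen only for this example.

```python
import math
import random


def smooth_path(n_steps, dim=2, alpha=0.9, seed=0):
    """Low-pass-filtered random walk: a stand-in (assumption) for a point
    moving smoothly through a low-dimensional continuity map."""
    rng = random.Random(seed)
    pos = [0.0] * dim
    vel = [0.0] * dim
    path = []
    for _ in range(n_steps):
        # Velocity is a first-order autoregressive process, so it changes slowly.
        vel = [alpha * v + (1.0 - alpha) * rng.gauss(0.0, 1.0) for v in vel]
        pos = [p + v for p, v in zip(pos, vel)]
        path.append(tuple(pos))
    return path


def code_posterior(pos, centers, sigma=1.0):
    """P(code | position), assuming one isotropic Gaussian per code over the
    map and equal prior probability for every code."""
    logits = [
        -sum((p - m) ** 2 for p, m in zip(pos, mu)) / (2.0 * sigma ** 2)
        for mu in centers
    ]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def emit_codes(path, centers, sigma=1.0, seed=1):
    """Emit one discrete sound code per time step; the code at time t depends
    only on the map position at time t."""
    rng = random.Random(seed)
    codes = []
    for pos in path:
        probs = code_posterior(pos, centers, sigma)
        codes.append(rng.choices(range(len(centers)), weights=probs)[0])
    return codes


# Four hypothetical sound codes placed around the origin of a 2-D map.
centers = [(-2.0, 0.0), (0.0, 2.0), (2.0, 0.0), (0.0, -2.0)]
path = smooth_path(50)
codes = emit_codes(path, centers)
print(codes)
```

Because the path moves slowly, nearby time steps tend to emit codes whose densities overlap, which is one way to picture how a smoothness constraint carries temporal context. Training, which the abstract says needs only acoustic data, is not shown here.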

     




The Center for Language and Speech Processing
The Johns Hopkins University
3400 North Charles Street, Barton Hall
Baltimore, MD 21218
Telephone: (410) 516-4237, Fax: (410) 516-5050, E-mail: clsp@clsp.jhu.edu