Acoustic Processing/Modeling Group: Exploring the Time Dimension at Different Scales

Research Group of the 1997 Summer Workshop

In the 1997 JHU/CLSP workshop (WS97), our group revisits the acoustic processor architecture employed in state-of-the-art large-vocabulary continuous speech recognition systems. We investigate data-driven processing paradigms, exploring techniques at different context scales. At short time scales (~10 ms), we investigate the non-linear frequency mapping known as the Mel scale. At medium time scales (context ~100 ms), we investigate linear discriminant and heteroscedastic discriminant transforms. At time scales with longer context (~1000 ms), we explore feature-trajectory filtering. At even longer time scales (~500 ms to 4 s), we experiment with adaptive cepstrum bias normalization techniques.
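
For readers unfamiliar with the shortest and longest scales mentioned above, the following is a minimal illustrative sketch, not the group's implementation. It assumes one widely used analytic approximation of the Mel scale (the summary does not say which form the group studied) and replaces the adaptive bias normalization with the simplest fixed variant, subtracting the per-utterance mean from each cepstral coefficient. The function names hz_to_mel, mel_to_hz and subtract_cepstral_mean are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Map frequency in Hz to mels using a common analytic approximation.

    Assumption: the 2595 * log10(1 + f / 700) form; other variants exist.
    """
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel under the same assumed approximation."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def subtract_cepstral_mean(frames):
    """Remove a constant bias from each cepstral coefficient.

    frames: list of equal-length lists, one list of coefficients per frame.
    This is plain per-utterance mean subtraction, a simplification of the
    adaptive bias normalization mentioned in the text.
    """
    n_frames = len(frames)
    dim = len(frames[0])
    # Mean of each coefficient over all frames in the utterance.
    mean = [sum(frame[k] for frame in frames) / n_frames for k in range(dim)]
    return [[frame[k] - mean[k] for k in range(dim)] for frame in frames]


if __name__ == "__main__":
    print(hz_to_mel(1000.0))
    print(subtract_cepstral_mean([[1.0, 2.0], [3.0, 6.0]]))
```

A constant offset in the cepstral domain corresponds to a fixed linear channel, which is why removing the long-term mean needs a window of roughly a second or more rather than a single 10 ms frame.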

The results of our investigation are very encouraging and are summarized in the online final reports and papers.

Team Members
Senior Members
Andreas Andreou	CLSP
Hynek Hermansky	CLSP
Juergen Luettin	IDIAP
Yasuhiro Minami	NTT Human Interface Labs
Christian Wellekens	Eurecom
Graduate Students
Terri Kamm	CLSP
Daniel Fain	CalTech

