Progress in speaker adaptation and acoustic modeling for LVCSR – George Saon (IBM)

October 4, 2005 (all day)

This talk is organized in two parts. In the first part, we discuss a non-linear feature-space transformation for speaker/environment adaptation that forces the individual dimensions of the acoustic data to be Gaussian distributed. For each dimension, the transformation is obtained by applying the inverse of the Gaussian cumulative distribution function (CDF) to the empirical CDF of the data. In the second part, we review existing techniques for precision matrix modeling, such as EMLLT and SPAM, and describe our recent work on discriminative training of full-covariance Gaussians on the 2300-hour EARS dataset.
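
The Gaussianizing transformation described in the first part can be illustrated with a minimal sketch for a single feature dimension. This is not the speaker's implementation: the function name `gaussianize`, the rank-based estimate of the empirical CDF, the half-sample offset that keeps CDF values strictly inside (0, 1), and the handling of ties are all assumptions made for illustration.

```python
from statistics import NormalDist


def gaussianize(values):
    """Map the samples of one feature dimension to roughly standard-normal values.

    Hypothetical helper: each sample is replaced by the inverse Gaussian CDF
    of its empirical CDF value, estimated here from the sample's rank.
    """
    n = len(values)
    # Indices of the samples in ascending order of value
    # (ties keep their original order, an arbitrary choice).
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # zero mean, unit variance
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Empirical CDF estimate; the 0.5 offset (an assumption) avoids
        # the values 0 and 1, where the inverse Gaussian CDF is undefined.
        u = (rank + 0.5) / n
        out[i] = std_normal.inv_cdf(u)
    return out


if __name__ == "__main__":
    # Skewed toy data standing in for one dimension of one speaker's features.
    samples = [0.1, 0.2, 0.25, 0.4, 3.0, 9.5, 0.05]
    print(gaussianize(samples))
```

In an adaptation setting, such a mapping would presumably be estimated separately for each speaker or environment and applied dimension by dimension. The mapping is monotonic, so the ordering of the samples within a dimension is preserved.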

Johns Hopkins University

Johns Hopkins University, Whiting School of Engineering

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2680
