Progress in speaker adaptation and acoustic modeling for LVCSR – George Saon (IBM)
This talk is organized in two parts. In the first part, we discuss a non-linear feature space transformation for speaker/environment adaptation which forces the individual dimensions of the acoustic data to be Gaussian distributed. For each dimension, the transformation maps a value to the preimage, under the Gaussian cumulative distribution function (CDF), of its empirical CDF value. In the second part, we review some existing techniques for precision matrix modeling, such as EMLLT and SPAM, and we describe our recent work on discriminative training of full-covariance Gaussians on the 2300-hour EARS dataset.
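The per-dimension Gaussianization described in the first part can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the function name `gaussianize`, the rank-based estimate of the empirical CDF, and the 0.5 offset that keeps CDF values strictly inside (0, 1) are all assumptions made for the example.

```python
from statistics import NormalDist


def gaussianize(frames):
    """Map each dimension of `frames` to an approximately standard normal shape.

    `frames` is a list of equal-length feature vectors (hypothetical input
    format, e.g. all frames from one speaker).
    """
    n = len(frames)
    dim = len(frames[0])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    out = [[0.0] * dim for _ in range(n)]
    for d in range(dim):
        # Rank the frames by their value in dimension d.
        order = sorted(range(n), key=lambda i: frames[i][d])
        for rank, i in enumerate(order, start=1):
            # Empirical CDF value; the 0.5 offset (an assumption) avoids 0 and 1,
            # where the inverse Gaussian CDF is undefined.
            u = (rank - 0.5) / n
            # Preimage of u under the Gaussian CDF.
            out[i][d] = std_normal.inv_cdf(u)
    return out


frames = [[3.0, 10.0], [1.0, 30.0], [2.0, 20.0]]
warped = gaussianize(frames)
```

Because the mapping depends only on ranks within each dimension, it is monotone and preserves the ordering of the values. A production system would more likely build the empirical CDF from a histogram or a table so that unseen test frames can be warped as well.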