Recent Progress in Acoustic Speaker and Language Recognition – Alan McCree (Johns Hopkins Human Language Technology Center of Excellence)
In this talk, I will give an overview of recent progress in speaker and language recognition, with emphasis on our current work at the JHU HLTCOE. After a brief review of modern GMM subspace methods, in particular i-vectors, I will present approaches for pattern classification using these features, focusing on simple Gaussian probabilistic models. For language recognition, these models are quite effective, but our recent work has shown that discriminative training can improve performance. As a bonus, discriminative training also provides meaningful probability outputs without requiring a separate calibration process. For speaker recognition, on the other hand, classification is more difficult because of the limited enrollment data per speaker, and Bayesian methods have been successful. I will discuss a number of such methods, including the popular PLDA approach. Finally, I will describe our recent successes in adapting the parameters of these Gaussian models to new domains when labeled training data is not available.
Alan McCree is a Principal Research Scientist at the JHU HLTCOE, where his primary interest is in the theory and application of speaker and language recognition. His research in speech and signal processing at the COE, and previously at MIT Lincoln Laboratory, Texas Instruments, AT&T Bell Laboratories, and Linkabit, has found applications in international speech coding standards, digital answering machines, talking toys, and cellular telephones. He has an extensive publication and patent portfolio, and was named an IEEE Fellow in 2005. He received his PhD from Georgia Tech in 1992 after undergraduate and graduate degrees from Rice University.