Advanced Topics: Integrating High-level Information for Robust Speaker Recognition
Larry Heck - 07/10/2002
slides from Larry Heck's lecture
- Abstract:
To identify or verify a person based on their voice, state-of-the-art automatic speaker recognition systems rely almost exclusively on low-level information contained in the speech signal. This low-level information is extracted over short intervals of time (e.g., 20-30 ms) and is typically based on the spectral energies present in the speech interval. While systems can achieve commercially viable performance using this low-level approach, recent work has shown significant gains in accuracy and robustness through the inclusion of higher levels of information available in the speech signal. These include prosodics and idiosyncratic choices or sequences of phonemes, words, and phrases, among others.
This lecture will highlight some of the past and ongoing research in the integration of high-level information for robust speaker recognition, with a specific focus on the work at SRI and Nuance over the past 5 years.
- Biography:
Larry Heck received the MS and PhD in Electrical Engineering from the Georgia Institute of Technology in 1989 and 1991, respectively, and a BS in Electrical Engineering from Texas Tech University in 1986. He is currently Director of Speech R&D at Nuance Communications, leading Nuance's R&D efforts in speech recognition and voice authentication. Beginning in 1992, Larry worked at SRI International and served as principal investigator for a number of federally funded research programs in acoustics and speech, including active noise and vibration control, acoustic machinery monitoring, and speaker recognition. He started the SRI program for advanced research in speaker recognition algorithms, and led numerous multiyear projects funded by the NSA, ORD/CIA, US Marines, and SRI IR&D. He invented several of the key algorithms in the robust speaker recognition literature, with particular focus on detection of and compensation for distortion over wireline and wireless telephone communication channels. He introduced voice authentication to Nuance in 1995, and spearheaded the initial product and business development efforts for Nuance Verifier. He is a member of the IEEE Signal Processing Society's Speech Technical Committee, sits on the board of the Speaker and Language Characterization Special Interest Group of the International Speech Communication Association, and served on the scientific committee for the 2001 IEEE/ISCA Speaker Recognition "Odyssey" Workshop. He has published over 50 articles in acoustics and speech processing, and has 6 patents granted or pending in speech and speaker recognition.