Automatic Learning of Word Pronunciation from Data

Today’s recognizers are based on single pronunciations for most words. There are certain types of pronunciation variation (phone deletion/reduction, dialect) that are impossible to model at the acoustic level. The goal of this project is to learn automatically models of word pronunciation from data. For the Switchboard and Callhome corpora, a small number of words make up a large fraction of the total words spoken. To start with, the pronunciations of these errorful words will be learned. Pronunciation variants will no longer be treated as mutually independent, i.e., under the assumption that any speaker would choose one of the given variants with a given probability, independent of related choices made by him in the same conversation.


Team Members 
Senior Members
Sanjeev KhudanpurCLSP
Charles GallesDoD
Yu-Hung KaoTI
Steven WegmannDragon
Mitch WeintraubSRI
Graduate Students
Murat SaraçlarCLSP
Eric FoslerICSI

Johns Hopkins University

Johns Hopkins University, Whiting School of Engineering

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2680

Center for Language and Speech Processing