Today’s recognizers are based on a single pronunciation for most words. Certain types of pronunciation variation (phone deletion or reduction, dialect) are impossible to model at the acoustic level. The goal of this project is to automatically learn models of word pronunciation from data. In the Switchboard and Callhome corpora, a small number of words make up a large fraction of the total words spoken. To start with, the pronunciations of these frequent, errorful words will be learned. Pronunciation variants will no longer be treated as mutually independent; that is, the project drops the assumption that a speaker chooses one of the given variants with a given probability, independent of related choices made by that speaker in the same conversation.
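
To make the independence assumption concrete, here is a minimal sketch in Python. It is not the project's method. The phone strings, the toy data, and the function names `independent_model` and `speaker_dependent_model` are all hypothetical, and conditioning on the speaker's previous variant of the same word is only one assumed way to capture "related choices in the same conversation". The first function estimates P(variant | word) from pooled counts. The second conditions on the variant the same speaker last used for that word in the conversation.

```python
from collections import Counter, defaultdict


def normalize(counter):
    """Turn raw counts into relative frequencies."""
    total = sum(counter.values())
    return {key: n / total for key, n in counter.items()}


def independent_model(sides):
    """Estimate P(variant | word), pooling all speakers and ignoring context."""
    counts = defaultdict(Counter)
    for side in sides:
        for word, variant in side:
            counts[word][variant] += 1
    return {word: normalize(c) for word, c in counts.items()}


def speaker_dependent_model(sides):
    """Estimate P(variant | word, previous variant by the same speaker).

    The previous variant is None for a speaker's first token of a word.
    This conditioning scheme is an illustrative assumption.
    """
    counts = defaultdict(Counter)
    for side in sides:
        last_variant = {}  # word -> variant this speaker used most recently
        for word, variant in side:
            counts[(word, last_variant.get(word))][variant] += 1
            last_variant[word] = variant
    return {key: normalize(c) for key, c in counts.items()}


# Hypothetical toy data: one list of (word, phone string) tokens per
# conversation side. Speaker A always uses the full form, speaker B
# always uses the reduced form.
sides = [
    [("and", "ae n d"), ("and", "ae n d"), ("and", "ae n d")],
    [("and", "ih n"), ("and", "ih n"), ("and", "ih n")],
]

indep = independent_model(sides)
dep = speaker_dependent_model(sides)
print(indep["and"])
print(dep[("and", "ae n d")])
```

On this toy data the pooled model splits probability evenly between the two variants, while the conditioned model reflects that each speaker keeps using the same variant throughout the conversation.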