Student seminar: Hainan Xu, “Modeling Phonetic Context with Non-random Forests for Speech Recognition”
Baltimore, MD, 21218
Modern speech recognition systems typically cluster triphone phonetic contexts using decision trees. In this talk I will give a brief introduction to how decision trees are used in speech recognition and describe a way to build multiple complementary decision trees from the same data for the purpose of system combination. We do this by building the decision trees jointly, using an objective function with an added entropy term that encourages diversity among the trees. After the trees are built, the individual systems are trained in the standard way, and their emission probabilities are combined during decoding. Experiments show consistent gains from using multiple trees across multiple datasets.
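As a rough illustration of the two ideas in the abstract, the sketch below measures how differently two trees cluster the same contexts and then merges per-tree emission scores for one frame. It is not the talk's actual formulation: the function names are hypothetical, the joint-leaf entropy is only one plausible way to quantify diversity, and the weighted average of log-likelihoods is an assumed combination rule.

```python
import math
from collections import Counter


def joint_leaf_entropy(leaves_a, leaves_b, counts):
    """Entropy (in nats) of the joint leaf assignment made by two trees.

    leaves_a[i] and leaves_b[i] are the leaf ids that tree A and tree B
    assign to phonetic context i; counts[i] is the number of training
    frames observed for that context. Assumed diversity measure: the
    value is higher when the two trees partition the contexts differently.
    """
    total = sum(counts)
    joint = Counter()
    for a, b, c in zip(leaves_a, leaves_b, counts):
        joint[(a, b)] += c
    return -sum((c / total) * math.log(c / total)
                for c in joint.values() if c > 0)


def combine_log_likelihoods(log_likes, weights=None):
    """Weighted average of per-tree emission log-likelihoods for one frame.

    Assumed combination rule; with no weights given, every tree
    contributes equally.
    """
    if weights is None:
        weights = [1.0 / len(log_likes)] * len(log_likes)
    return sum(w * ll for w, ll in zip(weights, log_likes))


# Four contexts with equal counts, clustered by two trees.
same = joint_leaf_entropy([0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 1, 1])
diverse = joint_leaf_entropy([0, 0, 1, 1], [0, 1, 0, 1], [1, 1, 1, 1])
combined = combine_log_likelihoods([-2.0, -4.0])
```

In this toy case, two identical trees give a joint entropy of ln 2, while two trees that split the contexts along different lines give ln 4, which is the kind of difference an entropy term in the tree-building objective could reward.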