We describe the design and use of a decision tree to quantize the input
feature space of a classifier. The decision tree asks questions about a
multi-dimensional feature vector, with the questions being designed at
every stage of the tree-growing process, rather than being picked from a
predetermined set of questions. The quantization information provided by
the decision tree is used to eliminate a number of classes from
consideration, which simplifies the task of the classifier. We show that
computation in a speech recognition system can be reduced by a factor of
20 with negligible degradation in classification accuracy by using such
trees in preprocessing the acoustic features.
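
The class-elimination idea above can be illustrated with a minimal sketch. It assumes that each question is a linear threshold test on the feature vector and that each leaf stores the classes still worth scoring; the names `Node`, `shortlist`, and `classify`, the toy tree, and the distance-based scorer are all hypothetical and do not come from the system described here. The sketch covers only the lookup, not the design of the questions during tree growing.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Set


@dataclass
class Node:
    # Internal node: asks "is dot(weights, x) < threshold?" (assumed question form)
    weights: Optional[Sequence[float]] = None
    threshold: float = 0.0
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    # Leaf: the classes that remain candidates in this region of feature space
    classes: Optional[Set[str]] = None


def shortlist(node: Node, x: Sequence[float]) -> Set[str]:
    """Walk the tree from the root and return the candidate classes for x."""
    while node.classes is None:
        score = sum(w * xi for w, xi in zip(node.weights, x))
        node = node.yes if score < node.threshold else node.no
    return node.classes


def classify(x: Sequence[float], tree: Node,
             score_fn: Callable[[str, Sequence[float]], float]) -> str:
    """Score only the shortlisted classes instead of every class."""
    return max(shortlist(tree, x), key=lambda c: score_fn(c, x))


# Toy example: four classes, each with a 2-D mean (illustrative values only).
means = {"a": (-1, -1), "b": (-1, 1), "c": (1, -1), "d": (1, 1)}


def neg_sq_dist(c: str, x: Sequence[float]) -> float:
    return -sum((m - xi) ** 2 for m, xi in zip(means[c], x))


tree = Node(
    weights=(1, 0), threshold=0.0,
    yes=Node(classes={"a", "b"}),
    no=Node(weights=(0, 1), threshold=0.0,
            yes=Node(classes={"c"}),
            no=Node(classes={"c", "d"})),
)

best = classify((-0.9, 0.8), tree, neg_sq_dist)
```

In this toy setup the classifier scores at most two of the four classes per input. The saving in a real system depends on how small the leaf shortlists are relative to the full class inventory.
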
Part II - Speaker clustering and transformation for adaptation in ASR
We describe a speaker adaptation strategy that is based on first finding a subset of training speakers who are acoustically close to the test speaker. A linear transformation is then computed for each selected training speaker to better map that speaker's data to the test speaker's acoustic space. The system parameters (Gaussian means) are re-estimated for the test speaker using the transformed training data from only the selected training speakers. Experiments show that this scheme can provide a relative improvement of 18% in the error rate on a large-vocabulary task with as few as 3 sentences of adaptation data from the test speaker.
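
The three steps above (select close speakers, transform their data, re-estimate the means) can be sketched as follows. This is a simplified illustration under strong assumptions that are not taken from the system described here: speaker closeness is measured by the distance between mean feature vectors, the linear transformation is replaced by a per-dimension affine map that matches mean and spread, and every training frame already carries the index of its Gaussian. All function names and the toy data are hypothetical.

```python
from statistics import fmean, pstdev


def column_stats(frames):
    """Per-dimension mean and standard deviation of a list of feature vectors."""
    dims = list(zip(*frames))
    return [fmean(d) for d in dims], [pstdev(d) for d in dims]


def select_speakers(adapt_frames, train_speakers, k):
    """Pick the k training speakers whose mean feature vector is closest
    to that of the adaptation data (assumed closeness measure)."""
    target_mean, _ = column_stats(adapt_frames)

    def distance(name):
        mean, _ = column_stats([x for _, x in train_speakers[name]])
        return sum((m - t) ** 2 for m, t in zip(mean, target_mean))

    return sorted(train_speakers, key=distance)[:k]


def fit_transform(train_frames, adapt_frames):
    """Diagonal affine map y = a * x + b that matches the test speaker's
    per-dimension mean and spread (a stand-in for a full linear transform)."""
    m_tr, s_tr = column_stats(train_frames)
    m_te, s_te = column_stats(adapt_frames)
    scale = [te / tr if tr > 0 else 1.0 for te, tr in zip(s_te, s_tr)]
    offset = [mt - a * mr for mt, a, mr in zip(m_te, scale, m_tr)]
    return scale, offset


def adapt_means(adapt_frames, train_speakers, k):
    """Re-estimate each Gaussian mean from the transformed frames of the
    selected training speakers only."""
    sums, counts = {}, {}
    for name in select_speakers(adapt_frames, train_speakers, k):
        labelled = train_speakers[name]
        scale, offset = fit_transform([x for _, x in labelled], adapt_frames)
        for g, x in labelled:
            y = [a * xi + b for a, xi, b in zip(scale, x, offset)]
            acc = sums.setdefault(g, [0.0] * len(y))
            for i, yi in enumerate(y):
                acc[i] += yi
            counts[g] = counts.get(g, 0) + 1
    return {g: [s / counts[g] for s in acc] for g, acc in sums.items()}


# Toy data: each frame is (gaussian_index, feature_vector), 1-D for brevity.
train_speakers = {
    "s1": [(0, [0.0]), (1, [2.0])],
    "s2": [(0, [1.0]), (1, [3.0])],
    "s3": [(0, [10.0]), (1, [12.0])],
}
adapt_frames = [[1.0], [3.0]]

selected = select_speakers(adapt_frames, train_speakers, k=2)
new_means = adapt_means(adapt_frames, train_speakers, k=2)
```

Restricting re-estimation to the selected speakers keeps acoustically distant data (speaker `s3` in the toy example) from pulling the adapted means away from the test speaker.
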