Bayesian Networks: Algorithms and Structures for ASR – Geoffrey Zweig (T.J. Watson Research Center, IBM)
This talk will describe Bayesian networks and place them in the context of automatic speech recognition (ASR). The Bayesian network formalism has both representational and algorithmic components, and the talk will touch on each. Representationally, the networks provide a graphical way of factoring a joint probability distribution. The nodes in a Bayes net graph represent random variables, whose values can be either known or unknown. The arcs in the graph factor the joint distribution into a product of localized conditional probabilities, each of which involves only a few variables. The conditional probabilities can be represented with tables, Gaussians, or any other convenient function. Algorithmically, there are elegant and efficient procedures for computing marginal distributions over the values of the hidden variables, and for finding the most likely assignment of values. In ASR, these lead directly to the computation of state-occupancy probabilities and Viterbi decodings.

The key feature of Bayesian networks is that the algorithms are parameterized by the graph structure and the representation of conditional probabilities. This makes it very easy to explore a variety of probabilistic models with a minimum of code-writing. In addition to describing the basic algorithms, the talk will relate Bayesian networks to the HMMs currently in use in ASR, and show how they provide a simple method for extending HMMs to model phenomena such as rate-of-speech and articulatory motion.
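
As a rough illustration of the ideas in the abstract, the sketch below treats a small HMM as a chain-structured Bayesian network. The joint distribution is a product of local conditional probability tables, the forward-backward recursion gives the marginals over the hidden states (the state-occupancy probabilities), and Viterbi gives the most likely assignment. It is not taken from the talk: the two-state model, its probability values, and all function names are hypothetical and chosen only to keep the example small.

```python
from itertools import product

# Hypothetical toy model (not from the talk): 2 hidden states, 2 observation symbols.
prior = [0.6, 0.4]                  # P(q_1)
trans = [[0.7, 0.3], [0.4, 0.6]]    # P(q_t | q_{t-1}), rows indexed by q_{t-1}
emit = [[0.9, 0.1], [0.2, 0.8]]     # P(o_t | q_t), rows indexed by q_t


def joint(states, obs):
    """Joint probability as a product of the local conditional probabilities."""
    p = prior[states[0]] * emit[states[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= trans[states[t - 1]][states[t]] * emit[states[t]][obs[t]]
    return p


def occupancy(obs):
    """State-occupancy probabilities P(q_t = s | obs) via forward-backward."""
    n, T = len(prior), len(obs)
    # Forward pass: alpha[t][s] = P(o_1..o_t, q_t = s)
    alpha = [[prior[s] * emit[s][obs[0]] for s in range(n)]]
    for t in range(1, T):
        alpha.append([
            emit[s][obs[t]] * sum(alpha[t - 1][r] * trans[r][s] for r in range(n))
            for s in range(n)
        ])
    # Backward pass: beta[t][s] = P(o_{t+1}..o_T | q_t = s)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [
            sum(trans[s][r] * emit[r][obs[t + 1]] * beta[t + 1][r] for r in range(n))
            for s in range(n)
        ]
    z = sum(alpha[-1])  # P(obs), the normalizing constant
    return [[alpha[t][s] * beta[t][s] / z for s in range(n)] for t in range(T)]


def viterbi(obs):
    """Most likely hidden state sequence given the observations."""
    n = len(prior)
    delta = [prior[s] * emit[s][obs[0]] for s in range(n)]
    back = []
    for o in obs[1:]:
        # Best predecessor for each current state, then the updated path scores.
        ptr = [max(range(n), key=lambda r: delta[r] * trans[r][s]) for s in range(n)]
        delta = [delta[ptr[s]] * trans[ptr[s]][s] * emit[s][o] for s in range(n)]
        back.append(ptr)
    path = [max(range(n), key=lambda s: delta[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def brute_force_best(obs):
    """Exhaustive search over all state sequences, for checking viterbi()."""
    paths = product(range(len(prior)), repeat=len(obs))
    return list(max(paths, key=lambda p: joint(p, obs)))
```

Calling `occupancy([0, 1, 1])` returns one posterior distribution per time step, and `viterbi([0, 1, 1])` agrees with `brute_force_best([0, 1, 1])` on this toy model. Swapping the emission tables for Gaussians, or adding further parent variables to each time slice, would change only the local conditional probabilities and the graph structure, which is the kind of flexibility the abstract describes.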