Information Geometry and Some Variations on the EM Algorithm

Bill Byrne of CLSP/JHU
at the CLSP/JHU Summer Research Workshop on August 15, 1997 at 1:30 pm, Barton Hall 117.

Information Geometry and Some Variations on the EM Algorithm


I will present the minimum divergence information formulation of the EM algorithm and then introduce an extension of this description which I call the Generalized Alternating Minimization procedure. GAM incorporates the well-known GEM algorithm which allows variations on the M-Step of the EM procedure and introduces variations on the E-Step to allow operations such as re-sampling the training data.

This formulation allows uniform treatment of a variety of training procedures and likelihood criteria. For example, Viterbi, Baum Welch and MMI HMM reestimation are GAM procedures. In particular, the GAM formulation provides insight into the fixed point behavior of the Viterbi and Baum Welch training procedures.

I will give some novel convergence results for GAM estimation procedures will also be presented. As an extension of the EM algorithm, GAM does not in general increase likelihood at each iteration so converge to a local maximum in likelihood is not guaranteed. I'll discuss some conditions which ensure convergence.

I hope to keep the presentation informal. My intent is to describe some new ideas that provide insight into well known algorithms. I hope also to show the power of information geometric arguments in discussing learning procedures.