Using random matrix theory, we now have methods for computing low-rank representations of matrices that are both easy to understand and fast to run. I have been using these methods as a hammer to improve several statistical methods: stepwise regression, CCA, and HMMs. I'll discuss a few of these in this talk and how they connect to problems in NLP.
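To give a flavor of the kind of randomized low-rank method meant here, below is a minimal sketch of a randomized SVD in the style of random-projection range finders. This is an illustrative assumption about the family of techniques, not the speaker's specific algorithm; the function name and parameters are hypothetical.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, seed=0):
    """Approximate rank-`rank` SVD of A via random projection.

    Idea: multiply A by a random Gaussian test matrix to capture its
    dominant column space, then take an exact SVD of the much smaller
    projected matrix. Cost is dominated by a few matrix multiplies.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, n)
    # Random test matrix; A @ Omega approximately spans A's top singular directions.
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)      # orthonormal basis for the sampled range
    B = Q.T @ A                         # small (k x n) matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :rank], s[:rank], Vt[:rank]

# Usage: a 500 x 300 matrix of true rank 5 is recovered almost exactly.
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 5)) @ rng.standard_normal((5, 300))
U, s, Vt = randomized_svd(A, rank=5)
err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

The appeal for statistics is that the expensive linear algebra touches the full matrix only through matrix-vector products, which is what makes these methods fast enough to embed inside procedures like stepwise regression or CCA.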
Much of Dean’s current work is on statistical approaches to NLP problems and other issues in big data. He has developed several algorithms for fast variable selection in regression and has proven that they have nice theoretical properties. He has used vector models for words to make them easier to manipulate with statistical machinery. These often rely on spectral techniques; for example, he has used them to fit HMMs and probabilistic CFGs.