New Machine Learning Tools for Structured Prediction – Veselin Stoyanov (Johns Hopkins HLTCOE)

November 6, 2012 all-day

View Seminar Video
I am motivated by structured prediction problems in NLP and social network analysis. Markov Random Fields (MRFs) and other Probabilistic Graphical Models (PGMs) are suitable for representing structured prediction: they can model joint distributions and utilize standard inference procedures. MRFs also provide a principled ways for incorporating background knowledge and combining multiple systems.Two properties of structured prediction problems make learning challenging. First, structured prediction almost inevitably requires approximation to inference, decoding or model structure. Second, unlike the traditional ML setting that assumes i.i.d. training and test data, structured learning problems often consist of a single example used both for training and prediction.We address the two issues above. First, we argue that the presence of approximations in MRF-based systems requires a novel perspective on training. Instead of maximizing data likelihood, one should seek the parameters that minimize the empirical risk of the entire imperfect system. We show how to locally optimize this risk using error back-propagation and local optimization. On four NLP problems our approach significantly reduces loss on test data compared to choosing approximate MAP parameters.Second, we utilize data imputation in the limited data setting. At test time we use sampling to impute data that is a more accurate approximation of the data distribution. We use our risk minimization techniques to train fast discriminative models on the imputed data. With this we can: Train discriminative models given a single training and test exampletrain generative/discriminative hybrids that can incorporate useful priors and learn from semi-supervised data
Veselin Stoyanov is a postdoctoral researcher at the Human Language Technology Center of Excellence (HLT-COE) at Johns Hopkins University (JHU). Previously he spent two years working with Prof. Jason Eisner at JHU’s Center for Language and Speech Processing supported by a Computing Innovation Postdoctoral Fellowship. He received the Ph.D. degree from Cornell University under the supervision of Prof. Claire Cardie in 2009 and the Honors B.Sc. from the University of Delaware in 2002. His research interests reside in the intersection of Machine Learning and Computational Linguistics. More precisely, he is interested in using probabilistic models for complex structured problems with applications to knowledge base population, modeling social networks, extracting information from text and coreference resolution. In addition to the CIFellowship, Ves Stoyanov is the recipient of an NSF Graduate Research Fellowship and other academic honors.

Center for Language and Speech Processing