Markus Dreyer
I am a Ph.D. student in Natural Language Processing (NLP) at the Johns Hopkins University (JHU) and a member of the CLSP and the HLTCOE.
Here is my curriculum vitae.
Email: m...@gmail.com.
News: I defended my dissertation in 2010 and started as a Research Scientist at SDL Language Weaver.

Research Interests
Natural language processing, machine translation,
computational morphology, machine learning, finite-state
modeling, parsing

Publications

•
"Discovering Morphological Paradigms from Plain Text Using a Dirichlet Process Mixture Model"
Markus Dreyer and Jason Eisner (2011).
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Edinburgh.
[pdf] 
•
"Hill Climbing on Speech Lattices: A New Rescoring Framework"
A. Rastrow, M. Dreyer, A. Sethy, S. Khudanpur, B. Ramabhadran and M. Dredze (2011).
In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague.
[pdf] 
•
"A Non-Parametric Model for the Discovery of Inflectional Paradigms from Plain Text Using Graphical Models over Strings"
Markus Dreyer (2011).
Ph.D. Thesis, JHU, Baltimore.
[website] 
•
"Graphical Models over Multiple Strings"
Markus Dreyer and Jason Eisner (2009).
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Singapore.
[pdf  bib  slides: keynote'09, mov small/large, pdf export] 
•
"Latent-Variable Modeling of String Transductions with Finite-State Methods"
Markus Dreyer, Jason Smith, Jason Eisner (2008).
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Honolulu, Hawaii.
[pdf (small fix in Fig. 1)  bib] 
•
"Machine Translation System Combination using ITG-based Alignments"
Damianos Karakos, Jason Eisner, Sanjeev Khudanpur and Markus Dreyer (2008).
In Proceedings of the Conference of the Association for Computational Linguistics (ACL), Columbus, Ohio.
[pdf  bib] 
•
"Exploiting Prosody for PCFGs with Latent Annotations"
Markus Dreyer and Izhak Shafran (2007).
In Proceedings of Interspeech, Antwerp, Belgium.
[pdf  bib] 
•
"Comparing Reordering Constraints for SMT Using Efficient BLEU Oracle Computation"
Markus Dreyer, Keith Hall, Sanjeev Khudanpur (2007).
In Proceedings of the HLT-NAACL Workshop on Syntax and Structure in Statistical Translation (SSST), Rochester, New York.
[pdf  bib] 
•
"Better Informed Training of Latent Syntactic Features"
Markus Dreyer and Jason Eisner (2006).
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Sydney, Australia.
[pdf  bib] 
•
"Vine Parsing and Minimum Risk Reranking for Speed and Precision"
Markus Dreyer, David A. Smith, Noah A. Smith (2006).
In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL), New York.
[pdf  bib  slides] 
•
"Statistical Machine Translation by Parsing"
A. Burbank, M. Carpuat, S. Clark, M. Dreyer, P. Fox, D. Groves, K. Hall, M. Hearne, I. D. Melamed, Y. Shen, A. Way, B. Wellington, and D. Wu (2005).
CLSP Technical Report.
[pdf]

Software

•
fstrain
I wrote this toolkit for efficient training of finite-state machines in C++. It includes an implementation of the expectation semiring, used with OpenFst, to represent and manipulate finite-state machines with (log-linear) features. It uses R for parameter optimization and can handle potentially divergent objective functions.
You can use fstrain to train globally normalized sequence models (e.g. for POS tagging or NER), string-to-string transductions that may include deletions and insertions (e.g. for lemmatization), or simple maxent classifiers. It is always possible to compose several smaller models and train them jointly, e.g. for a factorial CRF. The tarball contains a README and doxygen documentation. I hope to add some tutorial-style documentation in the near future.
Download: [fstrain0.1.tar.gz] 
•
dyna
I am a member of the Dyna team. Some time ago, I wrote the Dyna frontend parser, the type inference system and some program transformations, like automatic binarization of dynamic programs.

Contact Information
Center for Language and Speech Processing
3400 N. Charles Street, CSE 321
Baltimore, MD 21218-2691
Email: m...@gmail.com
Phone: 410-516-6837
Fax: 410-516-5050