NEW: software tools developed during the workshop now available for download!
EGYPT
a statistical machine translation toolkit
Statistical Machine Translation
Project Description (old)
Project Goals (7/12/99)
Subtle Ad
Kevin Knight, USC/ISI (Team Leader): knight@isi.edu and knight@clsp.jhu.edu
Yaser Al-Onaizan, USC/ISI: yaser@isi.edu and yaser@clsp.jhu.edu
David Purdy, DoD: dpurdy_smt@rofti.org and dpurdy@clsp.jhu.edu
Jan Curin, Charles Univ, CR: curin@ufal.ms.mff.cuni.cz and curin@clsp.jhu.edu
Michael Jahr, Stanford: mjahr@stanford.edu and jahr@clsp.jhu.edu
John Lafferty, CMU: lafferty@cmu.edu and lafferty@clsp.jhu.edu
Dan Melamed, West Group
Noah Smith, UMD: nasmith@cs.umd.edu and nasmith@clsp.jhu.edu
Franz Josef Och, RWTH Aachen: och@i6.informatik.rwth-aachen.de and och@clsp.jhu.edu
David Yarowsky, CLSP/JHU: yarowsky@blaze.cs.jhu.edu
NEW: Downloadable software tools developed during the workshop
FINAL REPORT, 12/11/99
First Planning Meeting (April, 1999)
Second Planning Meeting (May, 1999)
First Project Report (July 24, 1999)
Second Project Report (August 11, 1999)
"A Statistical MT Tutorial Workbook" (also available as a Word version)
Initial Translation Model To-Do List
User-Level Documentation: Translation Model Training
Code-Level Documentation: Translation Model Training
"The Mathematics of Statistical Machine Translation" (P. Brown, S. Della Pietra, V. Della Pietra, R. Mercer), appeared in Computational Linguistics 19(2), 1993.
User-Level Documentation: Language Model Training
Code-Level Documentation: Language Model Training
User-Level Documentation: Decoding
Code-Level Documentation: Decoding
Documented Script: Analyzing Rough Features of Corpus
Whittle: Corpus Preparation Tool
Other Useful Corpus-Preparation Scripts (prep and split)
Facts about the Czech/English Corpus
Facts about the 50K French/English Corpus, very rare words replaced by UNK
Facts about the 100K French/English Corpus, very rare words replaced by UNK
Facts about the 200K French/English Corpus, very rare words replaced by UNK
Facts about the 300K French/English Corpus, very rare words replaced by UNK
Facts about the 400K French/English Corpus, very rare words replaced by UNK
Facts about the 500K French/English Corpus, very rare words replaced by UNK
Cairo: Alignment Inspection Tool (see sample screen dump)
How to Put Alignments in Cairo-Viewable Format
Czech Environment Settings
Evaluation of Czech/English Translations