An NSF Workshop: Language Engineering for Students and Professionals Integrating Research and Education
Normalization of Non-standard Words
Project Description
Richard Sproat (Team Leader), AT&T Labs, rws@research.att.com
Alan Black, Univ. of Edinburgh/CMU, awb@cs.cmu.edu
Stanley Chen, CMU, sfc@cs.cmu.edu
Shankar Kumar, CLSP/JHU, skumar@mail.clsp.jhu.edu
Mari Ostendorf, Boston University, mo@raven.bu.edu
Christopher Richards, Williams College, crichard@wso.williams.edu
Labelling Guide for Non-Standard Words [1]
Slides
Introduction to Non-Standard Words (Alan Black) [ps] [pdf]
First-Day Presentation (Richard Sproat) [ps] [pdf]
First Progress Report (Richard Sproat) [ps] [pdf]
Second Progress Report (Alan Black) [ps] [pdf]
Second Progress Report (Chris Richards) [ps] [pdf]
Second Progress Report (Shankar Kumar) [ps] [pdf]
Second Progress Report (Richard Sproat) [ps] [pdf]
Building Language Models [dvi]
Lattice Rescoring [dvi]
Lextools
Guide to Evaluating Text Normalization [dvi]
Final report [ps] [pdf]
Sample output of unsupervised abbreviation expansion (Section 7.6 of report)
Unsupervised abbreviation expansion methodology applied to letter-sequences/acronyms from the NANTC corpus
Final Presentation [1]
Speakers of American English may wish to consult the Labeling Guide for Non-Standard Words.
The Center for Language and Speech Processing
Johns Hopkins University
3400 N. Charles Street, Barton Hall, Baltimore, MD 21218
Telephone: 410 516 4237 Fax: 410 516 5050 E-mail: clsp@jhu.edu