CLSP Web Site | WS 98 Site Map
WS 98 Research Projects
An NSF Workshop: Language Engineering for Students and Professionals Integrating Research and Education
Four Normalization of Non-standard Words
Team Goals
Project Description
Team Members
Richard Sproat (Team Leader), AT&T Labs
Alan Black, Univ. of Edinburgh/CMU
Stanley Chen, CMU
Shankar Kumar, CLSP/JHU
Mari Ostendorf, Boston University
Christopher Richards, Williams College
Technical Papers & Resources
  • Labelling Guide for Non-Standard Words [1]
  • Introduction to Non-Standard Words (Alan Black) [ps] [pdf]
  • First-Day Presentation (Richard Sproat) [ps] [pdf]
  • First Progress Report (Richard Sproat) [ps] [pdf]
  • Second Progress Report (Alan Black) [ps] [pdf]
  • Second Progress Report (Chris Richards) [ps] [pdf]
  • Second Progress Report (Shankar Kumar) [ps] [pdf]
  • Second Progress Report (Richard Sproat) [ps] [pdf]
  • Building Language Models [dvi]
  • Lattice Rescoring [dvi]
  • Guide to Evaluating Text Normalization [dvi]
  • Final Report [ps] [pdf]
  • Sample output of unsupervised abbreviation expansion (Section 7.6 of report)
  • Unsupervised abbreviation expansion methodology applied to letter-sequences/acronyms from the NANTC corpus
  • Final Presentation
[1] Speakers of American English may wish to consult the Labeling Guide for Non-Standard Words.

The Center for Language and Speech Processing

Johns Hopkins University

3400 N. Charles Street, Barton Hall, Baltimore, MD 21218 
Telephone: 410 516 4237 Fax: 410 516 5050 E-mail:
