CLSP Web Site
WS 98 Site Map
WS 98 Research Projects
An NSF Workshop: Language Engineering for Students
and Professionals Integrating Research and Education 
 
 
Four Normalization of Non-standard Words
Team Goals
Project Description
Team Members
Richard Sproat (Team Leader) AT&T Labs rws@research.att.com
Alan Black Univ. of Edinburgh/CMU awb@cs.cmu.edu
Stanley Chen CMU sfc@cs.cmu.edu
Shankar Kumar CLSP/JHU skumar@mail.clsp.jhu.edu
Mari Ostendorf Boston University mo@raven.bu.edu
Christopher Richards Williams College crichard@wso.williams.edu
 
Technical Papers & Resources
Labelling Guide for Non-Standard Words [1]
Slides
  • Introduction to Non-Standard Words (Alan Black) [ps] [pdf]
  • First-Day Presentation (Richard Sproat) [ps] [pdf]
  • First Progress Report (Richard Sproat) [ps] [pdf]
  • Second Progress Report (Alan Black) [ps] [pdf]
  • Second Progress Report (Chris Richards) [ps] [pdf]
  • Second Progress Report (Shankar Kumar) [ps] [pdf]
  • Second Progress Report (Richard Sproat) [ps] [pdf]
Building Language Models [dvi]
Lattice Rescoring [dvi]
Lextools
Guide to Evaluating Text Normalization [dvi]
Final report [ps] [pdf]
Sample output of unsupervised abbreviation expansion (Section 7.6 of report)
Unsupervised abbreviation expansion methodology applied to letter-sequences/acronyms from the NANTC corpus
Final Presentation
 
[1] Speakers of American English may wish to consult the Labeling Guide for Non-Standard Words.
 

The Center for Language and Speech Processing

Johns Hopkins University

3400 N. Charles Street, Barton Hall, Baltimore, MD 21218 
Telephone: 410 516 4237 Fax: 410 516 5050 E-mail: clsp@jhu.edu