The Center for Language and Speech Processing





Research

A substantial part of the research at CLSP is carried out in conjunction with the summer workshop program, and CLSP personnel have been instrumental in much of the work done at the workshops. In 1997, for example, the pronunciation modeling projects and the hidden mode modeling project benefited from the direct participation of CLSP researchers, while other projects, such as the syllable modeling and discourse modeling projects, relied on ASR tools, models, and data provided by CLSP researchers. Beyond the workshops, speech research at CLSP has addressed the very difficult problem of recognizing conversational speech over the telephone. Novel ideas under investigation include nonreciprocal data sharing as an alternative to parameter tying, the use of prosodic information for modeling pronunciation variation, and a structured language model for exploiting syntactic dependencies. These are expected to lead to substantial advances in the state of the art.

Natural Language Processing research carried out at CLSP has had broad influence on the NLP community. For example, at the field's most recent international conference, researchers outside CLSP presented papers adapting a learning algorithm developed by CLSP members, Transformation-Based Learning, to a diverse array of tasks, including base noun phrase identification, word sense disambiguation, Korean morphological analysis, dialogue act tagging, and subordinate conjunction parsing. Lexical processing software that we have made freely available has been used by hundreds of sites and was the basis for a message understanding system developed at Mitre. Our work on word sense disambiguation and other types of lexical ambiguity resolution has also had broad impact, and our algorithms for topic-based and decision-list-based classification have been integrated into several commercial speech synthesizers and language annotators.