Workshop 2002
Preworkshop Lecture Friday, August 29, 2008


Seminar Information
Text and Audio Processing for Information Management: Francis Kubala - 07/08/2002
  • Abstract:

    Today, vigorous research and development is underway to make use of the unstructured information contained in audio and text sources. This undertaking is broad and interdisciplinary in scope. Specialists in the historically separate disciplines of automatic speech recognition and computational linguistics are now collaborating closely to attack the general problem of capturing important information in human language. At the same time, there are energetic efforts to move new language technologies out of the laboratories and into operational prototypes as fast as possible. This has brought specialists in distributed computing into close collaboration with the speech and language experts. It has also created a virtual spiral development cycle around the traditionally independent research and development cycles: feedback from end users is now beginning to influence the direction of the research. We believe that this overall state of affairs is an extremely positive one for the immediate future of speech and language technology R&D.

    This talk will focus on a DARPA-sponsored project at BBN that illustrates the broad scope of today’s speech and language R&D as described above. The objective of this project is to integrate, evaluate, and transition advanced human language information technologies that are emerging from the DARPA TIDES, EELD, and EARS programs. The centerpiece of this project is a continuously evolving demonstration system called TIDES OnTAP (Online Text and Audio Processing), which serves as a research test bed for investigating appropriate system architectures, usage models, and evaluation methodologies in a simulated operational environment. OnTAP is a distributed Internet system that continuously processes incoming open-source text and audio information in English and Arabic. The system’s browser-based user interface is designed to help information managers efficiently locate, organize, and understand the information contained in high-volume flows of multilingual content from mixed-media sources. The OnTAP demonstration system will be used in the talk to show interactively how advanced speech and language technologies are being rapidly integrated, evaluated, and transitioned.


The Center for Language and Speech Processing
The Johns Hopkins University
3400 North Charles Street, Barton Hall
Baltimore, MD 21218
*Telephone: (410) 516-4237 *Fax: (410) 516-5050 *E-mail: clsp@clsp.jhu.edu