Segmentation, Indexing, and Retrieval of Multilingual Multimedia Sources:
David Palmer
- 08/12/2002
slides from David Palmer's lecture (.pdf format)
- Location: Shaffer Hall, Room 101
- Time: 2:30 pm - 3:30 pm
- Abstract:
Television news broadcasts contain a wide variety of video, audio, and natural language features that can be used to segment, classify, summarize, and index the content. However, research on news broadcasts frequently focuses on only one aspect of the available multimedia information, without correlating valuable information across sources.
This talk presents details of the Virage Video, Text, and Audio Processing (ViTAP) system, a research prototype that automatically detects, combines, and correlates metadata derived from a range of information sources in news broadcasts. The ViTAP system, based upon a news-on-demand media processing system funded by DARPA under the TIDES program, provides real-time segmentation and indexing of news broadcasts and enables searching and retrieval of relevant stories within a large collection of broadcasts.
The ViTAP system combines research and commercial audio and language processing systems, such as speech recognition, machine translation, speaker ID, and named entity detection, with video and image processing technologies, such as face ID, onscreen logo ID, and onscreen OCR. The ViTAP system is designed for multilingual news processing and currently captures and indexes 54 news broadcasts daily from 6 English-, Arabic-, and Mandarin-language satellite sources. The system is built upon Virage's open architecture so that new components (e.g., speech recognition and machine translation for new languages) can easily be integrated.
- Biography:
David Palmer is currently a Senior Speech Researcher in the Advanced Technology Group at Virage in Woburn, MA. His work at Virage centers on the integration of speech and language technologies into automated video and audio processing systems. Previously, David was a Lead Scientist in the Natural Language Processing Research Group at MITRE in Bedford, MA, where he carried out independent research in speech recognition and statistical methods for automated language processing. He received his Ph.D. in Electrical Engineering from the University of Washington, where he worked with Mari Ostendorf. He also holds an M.S. in Computer Science from UC Berkeley and a B.S. in Electrical Engineering from Penn State, and he has studied in Germany at the Technical University of Munich as a Fulbright scholar.