Transcribing and Retrieving Broadcast News
Steve Young, Department of Engineering, Cambridge University & Entropic Limited, UK
December 8, 1998
As computing and communications continue to converge, on-line access to large databases of audio and video becomes increasingly possible. However, searching this kind of material by content is difficult because, unlike conventional documents, it contains no text to search.
This talk will describe current work at Cambridge which uses large vocabulary speech recognition to automatically index the audio soundtrack of radio and television broadcast news, so that news stories can subsequently be retrieved by content. The recognition system will be described, and the design of retrieval software for use with the automatically derived transcriptions will be discussed.
This work is part of an EPSRC project entitled "Multimedia Document Retrieval", carried out in collaboration with Cambridge University Computer Laboratory, the Olivetti and Oracle Research Lab, and Entropic Cambridge Research Lab.