Translingual Information Processing
March 29, 2005
Abstract
Searching unstructured information in the form of (largely) text with
increasing image, audio, and video content is fast becoming a daily
activity for many people. Increasingly, the content is becoming
multilingual (e.g., non-English speakers became the majority of online
users in the summer of 2001 and have continued to increase their share,
reaching two-thirds today). To help users find answers to their
information needs regardless of the original language of the relevant
content, we at IBM Research have a number of projects that handle
multilingual content, ranging from machine translation and information
extraction to topic detection and tracking.
In this talk, we will present an overview of our work on statistical
machine translation and demonstrate a cross-lingual search engine to
search Arabic content using English queries.