Archived Seminars by Year
January 28, 2003 | 12:00 pm
Patrick Haffner, AT&T Research
Abstract: Joint work with Corinna Cortes and Mehryar Mohri. Kernel methods are widely used in statistical learning techniques due to their excellent performance and their computational efficiency in high-dimensional feature space. However, text or speech data cannot always be represented by the fixed-length vectors that the traditional kernels handle. In this talk, we introduce a general framework, Rational Kernels, that extends kernel techniques to deal with variable-length sequences and more generally to deal with large sets of weighted alternative sequences represented by weighted automata. Far from being abstract and computationally complex objects, rational kernels can be readily implemented using general weighted automata algorithms that have been extensively used in text and speech processing and that we will briefly review. Rational kernels provide a general framework for the definition and design of similarity measures between word or phone lattices, particularly useful in speech mining applications. Viewed as a similarity measure, they can also be used in Support Vector Machines and significantly improve the spoken-dialog classification performance in difficult tasks such as the AT&T 'How May I Help You' (HMIHY) system. We present several examples of rational kernels to illustrate these applications. We finally show that many string kernels commonly considered in computational biology applications are specific instances of rational kernels.
Speaker Biography: Patrick Haffner graduated from Ecole Polytechnique, Paris, France in 1987 and from Ecole Nationale Superieure des Telecommunications (ENST), Paris, France in 1989. He received his PhD in speech and signal processing from ENST in 1994. His research interests center on statistical learning techniques that can be used to globally optimize real-world processes with speech or image input. With France Telecom Research, he developed multi-state time-delay neural networks (MS-TDNNs) and applied them to recognize telephone speech. In 1995, he joined AT&T Laboratories, where he worked on image classification using convolutional neural networks (with Yann LeCun) and Support Vector Machines (with Vladimir Vapnik). Using information theoretic principles, he also developed and implemented the segmenter used in the DjVu document compression system. Since 2001, he has been working on kernel methods and information theoretic learning for spoken language understanding.
February 4, 2003 | 12:00 pm
Jan Hajic, Institute of Formal and Applied Linguistics, Charles University
Abstract: The so-called "tectogrammatical" representation of natural language sentence structure will be described as it is being developed for the Prague Dependency Treebank and used for the Czech, English and German languages, with other languages (Arabic, Slovenian) in the works or planned. The tectogrammatical representation aims at a semi-universal representation of such language phenomena as predicate-argument structure, lexical semantics, discourse structure displayed at the sentence level, and co-reference both inside and across sentences. Its relation to the classical dependency- and parse-tree representation of (surface) sentence structure will be presented as well. Possible advantages of the tectogrammatical representation will be demonstrated on examples from the Machine Translation and Question Answering tasks.
Speaker Biography: More biographical information can be found here.
February 11, 2003 | 12:00 pm
Gert Cauwenberghs, Johns Hopkins University
Abstract: Recently it has been shown that a simple learning paradigm, the support vector machine (SVM), outperforms elaborately tuned expert systems and neural networks in learning to recognize patterns from sparse training examples. Underlying its success are mathematical foundations of statistical learning theory. I will present a general class of kernel machines that fit the statistical learning paradigm, and that extend to class probability estimation and MAP forward sequence decoding. Sparsity in the kernel expansion (number of support vectors) relates to the shape of the loss function, and (more fundamentally) to the rank of the kernel matrix. Applications will be illustrated with examples in image classification and phoneme sequence recognition. I will also briefly present the Kerneltron, a silicon support vector "machine" for high-performance, real-time, and low-power parallel kernel computation.
Speaker Biography: Dr. Cauwenberghs' research focuses on algorithms, architectures and VLSI systems for signal processing and adaptive neural computation, including speech and acoustic processors, focal-plane image processors, adaptive classifiers, and low-power coding and instrumentation. He has served as chair of the Analog Signal Processing Technical Committee of the IEEE Circuits and Systems Society, and is an associate editor of the IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing and the newly established IEEE Sensors Journal. More biographical information can be found here.
February 18, 2003 | 12:00 pm
Yorick Wilks, Department of Computer Science, University of Sheffield
Abstract: In this paper we present initial results from the METER (MEasuring TExt Reuse) project, whose aim is to explore issues pertaining to text reuse and derivation, especially in the context of newspapers using newswire sources. Although the reuse of text by journalists has been studied in linguistics, we are not aware of any investigation applying existing computational methods to this particular task and context. Here we concentrate on classifying newspaper articles according to their dependence upon PA (Press Association) copy, using a 3-class document-level scheme designed by domain experts from journalism and a number of well-known approaches to text analysis. We show that the 3-class document-level scheme is better implemented as two binary Naive Bayes classifiers, which achieve an F-measure of 0.7309.
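The decomposition of a 3-class scheme into two binary Naive Bayes classifiers mentioned in the abstract can be illustrated with a minimal sketch. Everything below is a toy: the documents, the labels and the word-count features are invented for illustration and are not the METER project's actual data, features, or cascade.

```python
from collections import Counter
from math import log

class BinaryNB:
    """Minimal multinomial Naive Bayes for two classes, add-one smoothing."""
    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = {c: log(labels.count(c) / len(labels)) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc.split())
        self.vocab = set().union(*self.counts.values())
        self.total = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        # Pick the class with the highest smoothed log-posterior.
        def score(c):
            s = self.prior[c]
            for w in doc.split():
                s += log((self.counts[c][w] + 1) /
                         (self.total[c] + len(self.vocab)))
            return s
        return max(self.classes, key=score)

# Toy first stage of a two-classifier cascade: derived vs. non-derived copy.
docs = ["police said the man was arrested",
        "the man was arrested police confirmed",
        "shares rose sharply in early trading",
        "local festival draws record crowds"]
stage1 = BinaryNB().fit(docs, ["derived", "derived", "non-derived", "non-derived"])
print(stage1.predict("police said the man was arrested today"))  # → derived
```

A second `BinaryNB`, trained only on the "derived" documents, could then separate the remaining two classes, giving the 3-way decision as two binary ones.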
Speaker Biography: More biographical information can be found here.
February 25, 2003 | 12:00 pm
Sergei Nirenburg, University of Maryland, Baltimore County
Abstract: The term ontological semantics refers to the apparatus for describing and manipulating meaning in natural language texts. Basic ontological-semantic analyzers take natural language texts as inputs and generate machine-tractable text meaning representations (TMRs) that form the basis of various reasoning processes. Ontological-semantic text generators take TMRs as inputs and produce natural language texts. Ontological-semantic systems centrally rely on extensive static knowledge resources: a language-independent ontology, the model of the world that includes models of intelligent agents; ontology-oriented lexicons (and onomasticons, or lexicons of proper names) for each natural language in the system; and a fact repository consisting of instances of ontological concepts as well as remembered text meaning representations. Applications of ontological semantics include knowledge-based machine translation, information retrieval and extraction, text summarization, ontological support for reasoning systems (including networks of human and software agents), general knowledge management, and others. In this talk I will give a broad overview of some of the ontological-semantic processing modules and static resources.
Speaker Biography: Biographical information coming soon.
March 4, 2003 | 02:00 pm
Michael Tanenhaus, University of Rochester
Abstract: All current models of spoken word recognition assume that as speech unfolds, multiple lexical candidates become partially activated and compete for recognition. However, models differ on fundamental questions such as the nature of the competitor set, the temporal dynamics of word recognition, how fine-grained acoustic information is used in discriminating among potential candidates, and how acoustic input is combined with information from the context of the utterance. I'll illustrate how each of these issues is informed by monitoring eye movements as participants follow instructions to use a computer mouse to click on and move pictures presented on a monitor. The timing and pattern of fixations allow for strong inferences about the activation of potential lexical competitors in continuous speech, while monitoring lexical access at the finest temporal grain to date, without interrupting the speech or requiring a meta-linguistic judgment. I'll focus on recent work examining the effects on lexical access of fine-grained acoustic variation, such as coarticulatory information in vowels, within-category differences in voice-onset time, interactions between acoustic and semantic constraints, and prosodic context.
Speaker Biography: Biographical information coming soon.
March 18, 2003 | 12:00 pm
Ciprian Chelba, Microsoft Research
Abstract: A growing amount of information is stored in relational databases, and many scenarios involving human-computer interaction by means of natural language can be distilled to the problem of designing interfaces to relational databases that are driven by natural language. The talk presents approaches to human-computer interaction by means of natural speech or free text. The methods described focus on relational domains --- such as Air Travel Information Systems (ATIS) --- where semantic models are well defined by simple entity-relationship diagrams (schemas). We distinguish between techniques that aim at classifying a speech utterance or typed sentence into some category (call/text routing) and higher-resolution forms of information extraction from text or speech that aim at recovering more precise domain-specific semantic entities such as dates, city/airport names, airlines, etc. The first part of the talk will focus on simple speech utterance/text classification techniques such as n-gram, Naive Bayes, and Maximum Entropy classifiers. The second part outlines an attempt at using the structured language model (SLM) --- as a syntactic parser enriched with semantic tags --- for extracting fine-grained semantic information from text.
Speaker Biography: Ciprian Chelba graduated from the Center for Language and Speech Processing at the Johns Hopkins University in January 2000. After graduation he joined the Speech Technology Group at Microsoft Research (http://research.microsoft.com/~chelba). His core research interests are in statistical language and speech processing, while his broader ones could be loosely described as statistical modeling. When not producing floating point numbers and trying to make sense of them, he goes out and enjoys outdoor activities such as hiking, tennis and skiing, as well as a good play or movie.
April 1, 2003 | 12:00 pm
Adam Berger, Eizel Technologies, Inc.
Abstract: The mobile Internet promises anytime, anywhere, convenient access to email, web, and networked applications. Parts of this promise - high-throughput 2.5G and 3G wireless networks, richly functional PDAs and phones - are already becoming available. But there remain several core technical problems hindering full-scale adoption of wireless data. One of these problems, for instance, is real-time document adaptation: how should a small-screen rendering algorithm adapt a hypertext document (web page, email message, etc.) which was designed for viewing on a standard PC display? Solving this problem draws on techniques in image processing, pattern recognition, networking, and of course, language processing. This talk introduces a proxy-based architecture designed to handle these kinds of problems. A mobile web proxy is a remote, high-performance agent, deployed on commodity PC or high-end dedicated hardware, which acts on behalf of a population of mobile device users. Demos as time permits.
Speaker Biography: Adam Berger is a founder and CTO of Eizel Technologies Inc. (www.eizel.com), a software firm whose products allow users to do new things with their mobile phones and PDAs. Adam's Ph.D. is from Carnegie Mellon University's School of Computer Science, where his research was at the intersection of machine learning and statistical language processing. Previously, he worked for several years in the statistical machine translation group at IBM's Thomas J. Watson Research Center, and held a research position at Clairvoyance Corporation, a Pittsburgh-based advanced technology firm specializing in information management.
April 15, 2003 | 12:00 pm
Edward Gibson, Department of Brain and Cognitive Sciences/Department of Linguistics and Philosophy, MIT
Abstract: Why is sentence (1) so much harder to understand than sentence (2)? (1) The student who the professor who the scientist collaborated with had advised copied the article. (2) The scientist collaborated with the professor who had advised the student who copied the article. The two sentences have very similar meanings, yet (1) is far more complicated than (2). In this presentation, I will present evidence from my lab for two independent factors in sentence complexity at play in sentences like (1) and (2) (Gibson, 1998, 2000). First, integration distance between syntactic dependents: the processing cost of integrating a new word w is shown to be proportional to the distance between w and the syntactic head to which w is being integrated. The syntactic dependents in (1) are generally much further apart than they are in (2), making (1) more complex. Second, syntactic storage, in terms of the number of partially processed syntactic dependents: our evidence suggests that complexity increases as the number of predicted syntactic dependents increases. This factor also predicts greater complexity for (1) relative to (2). Evidence for these two factors will be provided in the form of reading times and comprehension questionnaires across a variety of English (Grodner & Gibson, 2002), Japanese (Nakatani & Gibson, 2003) and Chinese (Hsiao & Gibson, 2003) materials. Furthermore, recent evidence will be presented which helps to distinguish how distance is quantified, in terms of discourse structure (Warren & Gibson, 2002) and/or interfering similar syntactic elements (Gordon et al., 2001).
Speaker Biography: Biographical information can be found here.
April 22, 2003 | 12:00 pm
Tim DiLauro, Johns Hopkins University
Abstract: The Digital Knowledge Center (DKC) is the digital library research and development department of the Sheridan Libraries. The DKC's research agenda focuses on the ingestion of, and access to, materials in digital libraries. Its projects emphasize the development of automated tools, systems, and software to reduce the costs and resources required for converting the vast knowledge within print materials into digital form. Fundamentally, the DKC's R&D efforts combine automated technologies with strategic human intervention. The DKC conducts research and development related to digital libraries in collaboration with faculty, librarians, and archivists both within and beyond Johns Hopkins University; provides expertise to facilitate the creation of digital library materials and services; focuses on assessment and evaluation of digital libraries through usability research and economic analyses; and provides leadership in fostering an environment and culture conducive to advancing the library and university in the information age. DKC projects have been funded by the National Science Foundation, the Institute of Museum and Library Services, the Mellon Foundation, a technology entrepreneurial group in Maryland, corporations, and individual donors. The Hodson Trust has provided an endowment to support the Director of the DKC and an Information Technology Assistant. The DKC has published numerous academic papers and has been featured in articles or news stories by the New York Times, Baltimore Magazine, Tech TV, UPI, and the Congressional Internet Caucus. Tim DiLauro, Deputy Director of the Digital Knowledge Center, will provide an overview of digital libraries and digital library research issues.
Speaker Biography: Biographical information coming soon.
April 29, 2003 | 12:00 pm
Allen Gorin, AT&T Research
Abstract: A critical component of any business is interacting with its customers, either through human agents or via automated systems. Many of these interactions involve spoken or written language, with associated customer profile data. Current methods for analyzing, searching and acting upon these interactions are labor intensive and often based on small samples or shallow views of the huge volumes of actual data. In this talk I will describe research directed at enabling businesses to browse, prioritize, select and extract information from these large volumes of customer interactions. A key technical issue is that the data is heterogeneous, comprising both speech and associated call/caller data. Experimental evaluation of these methods on AT&T's 'How May I Help You?'(sm) spoken dialog system will be presented.
Speaker Biography: Biographical information coming soon.
May 6, 2003 | 02:00 pm
Tomaso Poggio, Center for Biological and Computational Learning, Artificial Intelligence Laboratory and McGovern Institute for Brain Research, Massachusetts Institute of Technology
Abstract: This seminar will be held in 210 Hodson Hall from 11 am to 12 pm. Refreshments available at 10:45 am. The problem of learning is one of the main gateways to making intelligent machines and to understanding how the brain works. In this talk I will give a brief overview of our recent work on learning theory, including new results on predictivity and stability of the solution of the learning problem. I will then describe recent efforts in developing machines that learn in applications such as visual recognition and computer graphics. In particular, I will summarize our work on trainable, hierarchical classifiers for problems in object recognition and especially for face and person detection. I will also describe how we used the same learning techniques to synthesize a photorealistic animation of a talking human face. Finally, I will speculate briefly on the implication of our research on how visual cortex learns to recognize and perceive objects. Relevant papers can be downloaded from http://www.ai.mit.edu/projects/cbcl/publications/all-year.html.
Speaker Biography: Tomaso A. Poggio is the Eugene McDermott Professor in the Department of Brain and Cognitive Sciences at MIT; he is director of the Center for Biological and Computational Learning and a member of the Artificial Intelligence Laboratory and of the McGovern Institute for Brain Research. His work is motivated by the belief that the problem of learning is the gateway to making intelligent machines and understanding how the brain works. Research on learning in his group follows three directions: mathematics of learning theory and ill-posed problems; engineering applications (in computer vision, computer graphics, bioinformatics, data mining and artificial markets); and the neuroscience of learning, presently focused on how visual cortex learns to recognize and represent objects.
July 2, 2003 | 04:30 pm
Dan Ellis, Columbia University
Abstract: The recently-established Laboratory for Recognition and Organization of Speech and Audio (LabROSA) at Columbia has the mission of developing techniques to extract useful information from sound. This covers a range of areas: general-purpose structure discovery and recovery, i.e. the basic segmentation problem over scales from subwords to episodes, and on both time and frequency dimensions; source/object-based organization, explaining the observed signal as the mixture of the independent sources that would be perceived by listeners; and special-purpose recognition and characterization for specific domains such as speech (transcription, speaker tracking, etc.), music (classification and indexing), and other distinct categories. I will present more details on some current and new projects, including: tandem acoustic modeling (noise-robust speech recognition features calculated by a neural net); the Meeting Recorder project (acoustic information extraction applications in a conventional meeting scenario); and Machine Listening (hearing for autonomous devices in the real world).
Speaker Biography: Biographical info coming soon. For more information, please see this webpage.
July 11, 2003 | 12:00 pm
Ralph Grishman, New York University
Abstract: Event extraction involves automatically finding, within a text, instances of a specified type of event, and filling a database with information about the participants and circumstances (date, place) of the event. These databases can provide an alternative to traditional text search engines for repeated, focused searches on a single topic. Constructing an extraction system for a new event type requires identifying the linguistic patterns and classes of words that express the event. We consider the types of knowledge required and how this knowledge can be learned from text corpora with minimal supervision.
Speaker Biography: Please see this webpage.
July 16, 2003
July 23, 2003
July 31, 2003
Jacob T Schwartz
August 6, 2003
Hsinchun Chen: Crime Data Mining and Visualization for Intelligence and Security Informatics: The COPLINK Research
August 20, 2003
August 21, 2003
August 26, 2003
September 16, 2003
October 14, 2003
Philipp Koehn, University of Southern California / ISI
Abstract: I will review the state of the art in statistical machine translation (SMT), present my dissertation work, and sketch out the research challenges of syntactically structured statistical machine translation. The currently best methods in SMT build on the translation of phrases (any sequences of words) instead of single words. Phrase translation pairs are automatically learned from parallel corpora. While SMT systems generate translation output that often conveys a lot of the meaning of the original text, it is frequently ungrammatical and incoherent. The research challenge at this point is to introduce syntactic knowledge to the state of the art in order to improve translation quality. My approach breaks up the translation process along linguistic lines. I will present my thesis work on noun phrase translation and ideas about clause structure.
Speaker Biography: Philipp Koehn is expected to receive his PhD in Computer Science from the University of Southern California in Fall 2003. He is a research assistant at the Information Sciences Institute. He has worked as a visiting researcher at AT&T Labs and Whizbang Labs. He has published a number of papers on machine translation, lexical acquisition, machine learning and related subjects, and has given tutorials on statistical machine translation at recent HLT/NAACL and MT Summit conferences.
October 21, 2003 | 04:30 pm
Ken Church, AT&T Research Labs
Abstract: Can we use the past to predict the future? Moore's Law is a great example: performance doubles and prices halve approximately every 18 months. This trend has held up well to the test of time and is expected to continue for some time. Similar arguments can be found in speech, demonstrating consistent progress over decades. Unfortunately, there are also cases where history repeats itself, as well as major dislocations: fundamental changes that invalidate fundamental assumptions. What will happen, for example, when petabytes become a commodity? Can demand keep up with supply? How much text and speech would it take to match this supply? Priorities will change. Search will become more important than coding and dictation.
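The doubling arithmetic behind Moore's Law is easy to make concrete: doubling every 18 months means capacity grows by a factor of 2^(t/1.5) over t years. A minimal sketch (the 18-month period is the figure quoted in the abstract; the function name and starting point are illustrative):

```python
def moores_law_factor(years, doubling_period_years=1.5):
    """Growth factor under 'performance doubles every 18 months'."""
    return 2 ** (years / doubling_period_years)

# Over a decade that is roughly a hundredfold: 2**(10/1.5) ≈ 102.
print(round(moores_law_factor(10)))
```

By the same rule, prices halve on the same schedule, which is why a once-exotic storage size (a petabyte, in the abstract's example) eventually becomes a commodity.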
Speaker Biography: I am the head of a data mining department in AT&T Labs-Research. I received my BS, Masters and PhD in computer science from MIT in 1978, 1980 and 1983 respectively, and immediately joined AT&T Bell Labs, where I have been ever since (though the name of the organization has changed). I have worked in many areas of computational linguistics, including acoustics, speech recognition, speech synthesis, OCR, phonetics, phonology, morphology, word-sense disambiguation, spelling correction, terminology, translation, lexicography, information retrieval, compression, language modeling and text analysis. I enjoy working with very large corpora such as the Associated Press newswire (1 million words per week). My data mining department is currently applying similar methods to much larger data sets, such as telephone call detail (1-10 billion records per month).
October 28, 2003 | 04:30 pm
Avi Pfeffer, Harvard University
Abstract: Uncertainty is ubiquitous in the real world, and probability provides a sound way to reason under uncertainty. This fact has led to a plethora of probabilistic representation languages such as Bayesian networks, hidden Markov models and stochastic context-free grammars. More recently, we have developed new probabilistic languages that reason at the level of objects, such as object-oriented Bayesian networks and probabilistic relational models. The wide variety of languages raises the question of whether a general-purpose probabilistic modeling language can be developed that encompasses all of them. This talk will describe IBAL, an attempt at developing such a language. After presenting the IBAL language, motivating considerations for the inference algorithm will be discussed, and the mechanism for IBAL inference will be described.
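As a loose illustration of the kind of computation a general-purpose probabilistic modeling language must support, the sketch below performs exact inference by enumerating the joint distribution over independent discrete choices. This is not IBAL's syntax or its inference mechanism, just a minimal Python stand-in for the underlying idea; the model and names are invented.

```python
from itertools import product

def enumerate_joint(vars_dists):
    """Enumerate possible worlds for independent discrete choices.
    vars_dists: dict of variable name -> {value: probability}."""
    names = list(vars_dists)
    for combo in product(*(vars_dists[n].items() for n in names)):
        world = {n: v for n, (v, _) in zip(names, combo)}
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield world, prob

def query(vars_dists, event, evidence=lambda w: True):
    """P(event | evidence) by exhaustive enumeration."""
    num = den = 0.0
    for world, p in enumerate_joint(vars_dists):
        if evidence(world):
            den += p
            if event(world):
                num += p
    return num / den

# Toy model: two biased coins; query P(first is heads | at least one heads).
model = {"c1": {"H": 0.6, "T": 0.4}, "c2": {"H": 0.6, "T": 0.4}}
p = query(model, lambda w: w["c1"] == "H",
          evidence=lambda w: "H" in (w["c1"], w["c2"]))  # ≈ 0.714
```

Languages like the ones the abstract surveys replace this brute-force enumeration with structured representations (networks, grammars, relational models) that make inference tractable for large models.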
November 4, 2003 | 04:30 pm
Vijay Shanker, University of Delaware
Abstract: Information extraction from the biology literature has gained considerable attention in recent years. Classifying the named entities mentioned in the text can help the information extraction process in many ways. In this talk, I will discuss some of our recent work on named entity classification in this domain. The talk will focus on the kinds of features that we believe are useful for classification purposes. Our investigation looks at both name-internal features (in the form of certain informative words or suffixes found within the names) and name-external features, in the form of words in context and the use of some simple syntactic constructions. We will conclude with some remarks on how named entity classification helps our project on extracting phosphorylation relations from the biology literature.
November 11, 2003 | 04:30 pm
David Myers, Hope College
Abstract: David Myers will describe progress, in Europe and in west Michigan, toward a world in which hearing aids serve not only as sophisticated microphone amplifiers, but also as customized loudspeakers. He will also share his vision for how "hearing aid compatible assistive listening" could enrich the lives of America's hard of hearing people and lessen the stigma of hearing aids and hearing loss.
Speaker Biography: Hope College social psychologist David Myers (davidmyers.org) is the author of scientific publications in two dozen journals, including Science, Scientific American, and the American Scientist. His science writings for college students and the lay public have also appeared in many magazines and in 15 books, including A Quiet World: Living with Hearing Loss (Yale University Press, 2000). His advocacy for a revolution in American assistive listening is explained at hearingloop.org.
November 28, 2003 | 04:30 pm
Susan Dumais, Microsoft Research