Bill Byrne
November 24th
4:30PM
"Hierarchical Phrase-based Translation with Weighted Finite State Transducers"
Vector-based Models of Semantic Composition
Mirella Lapata - November 3rd, 2009
University of Edinburgh
Abstract
Vector-based models of word meaning have become increasingly popular
in natural language processing and cognitive science. The appeal of
these models lies in their ability to represent meaning simply by
using distributional information under the assumption that words
occurring within similar contexts are semantically similar. Despite
their widespread use, vector-based models are typically directed at
representing words in isolation, and methods for constructing
representations for phrases or sentences have received little
attention in the literature.
In this talk we propose a framework for representing the meaning of
word combinations in vector space. Central to our approach is vector
composition, which we operationalize in terms of additive and
multiplicative functions. Under this framework, we introduce a wide
range of composition models which we evaluate empirically on a phrase
similarity task. We also propose a novel statistical language model
that is based on vector composition and can capture long-range
semantic dependencies.
Joint work with Jeff Mitchell
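
As a rough illustration of the additive and multiplicative composition
functions mentioned in the abstract, the sketch below composes two word
vectors by element-wise addition and by element-wise multiplication, then
compares vectors with cosine similarity. The function names, the toy
co-occurrence counts, and the choice of cosine as the similarity measure are
assumptions made for illustration only; the abstract does not specify the
exact models or evaluation measure used in the talk.

```python
import math


def compose_additive(u, v):
    """Additive composition: p_i = u_i + v_i."""
    return [a + b for a, b in zip(u, v)]


def compose_multiplicative(u, v):
    """Multiplicative composition: p_i = u_i * v_i (element-wise)."""
    return [a * b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical co-occurrence counts over five context dimensions.
horse = [0, 6, 2, 10, 4]
run = [1, 8, 4, 4, 0]

additive = compose_additive(horse, run)              # [1, 14, 6, 14, 4]
multiplicative = compose_multiplicative(horse, run)  # [0, 48, 8, 40, 0]

# Compare each composed phrase vector with one of its constituent words.
print(cosine(additive, horse))
print(cosine(multiplicative, horse))
```

Addition keeps every dimension on which either word is active, whereas
multiplication keeps only the dimensions on which both words are active, so
the two functions can rank the same phrase pairs differently.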
Biography
Mirella Lapata is a Reader (roughly equivalent to a US associate professor) in
the School of Informatics at the University of Edinburgh. Her research
interests are in natural language processing, focusing on semantic
interpretation and generation. She obtained a PhD degree in
Informatics from the University of Edinburgh in 2001 and spent two
years as a faculty member in the Department of Computer Science at the
University of Sheffield. She received a B.A. degree in computer
science from the University of Athens in 1994 and an MSc degree from
Carnegie Mellon University in 1998.


