How Geometric Should Our Semantic Models Be?
Katrin Erk, University of Texas
September 4, 2012
Vector space models represent the meaning of a word through the contexts in which it has been observed. Each word becomes a point in a high-dimensional space in which the dimensions stand for observed context items. One advantage of these models is that they can be acquired from corpora in an unsupervised fashion. Another advantage is that they can represent word meaning in context flexibly and without recourse to dictionary senses: each occurrence gets its own point in space; the points for different occurrences may cluster into senses, but they do not have to. Recently, a number of approaches have aimed to extend the vector space success story from word representations to the representation of whole sentences. However, these approaches still face many technical challenges (apart from the open question of whether all semantic tasks can be reduced to similarity judgments). An alternative is to combine the depth and rigor of logical form with the flexibility of vector space approaches.
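As a rough illustration of the idea that each word becomes a point whose dimensions are observed context items, the following minimal sketch builds count vectors from a toy corpus and compares them by cosine similarity. The corpus, the window size, and the function names `cooccurrence_vectors` and `cosine` are assumptions made for this example; they are not taken from the talk, and real models typically use much larger corpora and weighted counts.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of context words seen within `window` tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    # Each context word is one dimension of the target's vector.
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, chosen only to make the contrast visible.
toy_corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the bank lends money".split(),
]

vectors = cooccurrence_vectors(toy_corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_bank = cosine(vectors["cat"], vectors["bank"])
```

In this toy setting, "cat" and "dog" share more context items than "cat" and "bank", so `sim_cat_dog` comes out higher than `sim_cat_bank`. Nothing was supervised: the vectors come directly from observed contexts.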
Katrin Erk is an associate professor in the Department of Linguistics at the University of Texas at Austin. She completed her dissertation on tree description languages and ellipsis at Saarland University in 2002. From 2002 to 2006, she held a researcher position at Saarland University, working on manual and automatic frame-semantic analysis. Her current research focuses on computational models for word meaning and the automatic acquisition of lexical information from text corpora.