Entity disambiguation is the problem of determining whether two mentions of entities refer to the same object: for example, deciding whether the entity called “Jim Clark” in one document is the same as the entity called “Jim Clark” in another document. To do this accurately, it is necessary to extract from these documents descriptions of these entities that are as exhaustive and accurate as possible. This in turn requires ‘tracking’ these entities in each document (identifying all or most of their mentions) and collecting their properties, particularly those that help the most to discriminate between individuals.
The goal of the workshop is to further the state of the art in entity disambiguation by developing better techniques for tracking entities and for extracting their properties. A particular focus will be improving entity tracking by using lexical and encyclopedic knowledge extracted both from structured lexical databases and from semi-structured repositories such as Wikipedia. Lack of such knowledge is one of the main problems with current entity tracking methods, which typically cannot detect that ‘the Packwood proposal’ and ‘the Packwood plan’ in the following example refer to the same entity.
Methods to be used include text mining techniques (supervised and unsupervised) to extract object properties; better machine learning techniques to improve entity tracking (e.g., using tree kernels); methods for extracting knowledge from WordNet, semantic role labellers, and Wikipedia; and clustering methods for entity disambiguation.
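As a rough illustration of the clustering step mentioned above, the sketch below groups documents that mention the same name by the overlap of the properties extracted for that name. It is not the workshop's method: the Jaccard similarity, the single-link merging rule, the `threshold` value, the function names, and the toy property sets are all assumptions chosen to keep the example small.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two property sets, in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_mentions(
    descriptions: Dict[str, Set[str]], threshold: float = 0.3
) -> List[Set[str]]:
    """Single-link clustering: merge any two documents whose property
    sets have Jaccard similarity >= threshold (hypothetical cutoff)."""
    # Union-find structure: each document starts in its own cluster.
    parent = {doc: doc for doc in descriptions}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(descriptions, 2):
        if jaccard(descriptions[a], descriptions[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: Dict[str, Set[str]] = {}
    for doc in descriptions:
        clusters.setdefault(find(doc), set()).add(doc)
    # Sort for a deterministic return order.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Toy, hand-written property sets for four documents mentioning "Jim Clark".
descriptions = {
    "doc1": {"racing driver", "Formula One", "Scotland"},
    "doc2": {"Formula One", "Scotland", "Lotus"},
    "doc3": {"Netscape", "entrepreneur", "Silicon Graphics"},
    "doc4": {"entrepreneur", "Netscape", "founder"},
}
clusters = cluster_mentions(descriptions)
print(clusters)
```

With these toy inputs, the two documents about the racing driver end up in one cluster and the two about the entrepreneur in another. In practice the quality of the result depends far more on how well the properties are extracted than on the particular clustering rule.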
Entity Disambiguation Scoring Metrics
SVMs and Kernels
Versley System – PDF
Team Members | |
Senior Members | |
Ron Artstein | University of Essex |
David Day | MITRE |
Jason Duncan | Department of Defense |
Alessandro Moschitti | University of Trento |
Massimo Poesio | University of Essex and University of Trento |
Xiaofeng Yang | Institute for Infocomm Research, Singapore |
Graduate Students | |
Jason Smith | CLSP |
Robert Hall | University of Massachusetts |
Simone Ponzetto | EML Research |
Yannick Versley | University of Tübingen |
Michael Wick | University of Massachusetts |
Undergraduate Students | |
Vladimir Eidelman | Columbia University |
Alan Jern | University of California Los Angeles |
Brett Shwom | New York University |
Affiliate Members | |
Walter Daelemans | University of Antwerp |
Claudio Giuliano | FBK-IRST |
Janet Hitzeman | MITRE |
Veronique Hoste | University of Antwerp |
Emily Jamison | Ohio |
Mijail Kabadjov | Edinburgh University |
Gideon Mann | University of Massachusetts |
Sameer Pradhan | BBN |
Michael Strube | EML Research |