Named Entity Classification in the Biology Domain: Investigating Some Sources of Information – Vijay Shanker (University of Delaware)

November 4, 2003 all-day

Information extraction from the biology literature has gained considerable attention in recent years. Classifying the named entities mentioned in the text can help the information extraction process in many ways.
In this talk, I will discuss some of our recent work on named entity classification in this domain. The talk will focus on the kinds of features that we believe are useful for classification. Our investigation looks at both name-internal features (certain informative words or suffixes found within the names) and name-external features (words in the surrounding context and some simple syntactic constructions). We will conclude with some remarks on how named entity classification helps our project on extracting phosphorylation relations from the biology literature.
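
As a rough illustration of the two feature families mentioned in the abstract, the sketch below collects name-internal features (informative words and suffixes inside the name) and name-external features (words in a small context window). It is not the method presented in the talk: the function name `extract_features`, the cue lists `INFORMATIVE_WORDS` and `SUFFIXES`, and the window size are all assumptions made for this example, and the syntactic-construction features are left out.

```python
from typing import Dict, List

# Hypothetical cue lists, chosen only for illustration (not from the talk).
INFORMATIVE_WORDS = {"kinase", "receptor", "protein", "cell", "virus"}
SUFFIXES = ("ase", "in", "oma", "itis")


def extract_features(tokens: List[str], start: int, end: int,
                     window: int = 2) -> Dict[str, bool]:
    """Return binary features for the name spanning tokens[start:end]."""
    features: Dict[str, bool] = {}
    name = [t.lower() for t in tokens[start:end]]

    # Name-internal features: informative words and suffixes within the name.
    for word in name:
        if word in INFORMATIVE_WORDS:
            features[f"internal_word={word}"] = True
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) > len(suffix):
                features[f"internal_suffix={suffix}"] = True

    # Name-external features: words within `window` tokens on either side.
    for offset in range(1, window + 1):
        left = start - offset
        right = end - 1 + offset
        if left >= 0:
            features[f"left_{offset}={tokens[left].lower()}"] = True
        if right < len(tokens):
            features[f"right_{offset}={tokens[right].lower()}"] = True

    return features


if __name__ == "__main__":
    sentence = "The MAP kinase phosphorylates the receptor".split()
    # The name "MAP kinase" occupies tokens 1 and 2.
    for feature in sorted(extract_features(sentence, 1, 3)):
        print(feature)
```

A feature dictionary of this kind could be passed to any standard classifier; which classifier and which cue lists work best for biology text is not something this sketch addresses.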

