Semantic Lexicons and Semantic Tagging: towards content interoperability – Nicoletta Calzolari (Istituto di Linguistica Computazionale)

March 2, 2004 all-day

Large-scale language resources are unanimously recognised as the necessary infrastructure underlying language technology. Discussing a few major European initiatives for building harmonised lexicons, we will highlight how computational lexicons and textual corpora should be considered complementary views on the lexical space. A “complete” computational lexicon should incorporate our “knowledge of the world” and represent it in an explicit and formal way. We claim that it is theoretically impossible to achieve completeness within any “static” lexicon. A sound language infrastructure must encompass both “static” lexicons, such as the traditional ones, and “dynamic” systems able to enrich the lexicon with information acquired on-line from large corpora, thus capturing the “actually realised” potentialities, the large range of variation, and the flexibility inherent in the language as it is used. These are the challenges for semantic tagging. Part of the talk will point to problems that have arisen in different semantic annotation exercises. Looking to the future, the need for ever-growing language resources for effective content processing requires a change of paradigm and the design of a new generation of language resources based on open content interoperability standards. The Semantic Web notion will crucially determine the shape of the language resources of the future, consistent with the vision of an open space of sharable knowledge available on the Web for processing.

Nicoletta Calzolari graduated in Philosophy at the University of Bologna. She is Director of Research at CNR and now Director of the Istituto di Linguistica Computazionale of the CNR in Pisa, Italy. She has worked in the field of Computational Linguistics since 1972. Her main fields of interest are computational lexicology and lexicography; text corpora; standardisation and evaluation of language resources; lexical semantics; and knowledge acquisition from multiple (lexical and textual) sources, together with its integration and representation. She has co-ordinated many international, European and national projects. She is a member and general secretary of ICCL, a member of the ELRA Board, and a member of many international committees and advisory boards. She is conference chair of LREC’04 and has been an invited speaker, program committee member or organiser for numerous international scientific conferences and workshops.

