Relational Learning for Natural Language Parsing and Information Extraction

Raymond J. Mooney, The University of Texas at Austin

April 14, 1998


Abstract

We are exploring the application of relational learning methods, such as inductive logic programming, to the construction of natural language processing systems. We have developed a system, CHILL, for learning a deterministic parser from a corpus of parsed sentences. CHILL can construct complete natural-language interfaces that translate natural-language database queries directly into executable logical form. It has been tested on English queries for a small database on U.S. geography, where it answered queries more accurately than a previous hand-built system. It has also recently been tested on Spanish, Turkish, and Japanese queries for the same database, as well as on English queries about jobs posted to the newsgroup misc.jobs.offered and about restaurants in the California Bay Area. We are also developing a system for inducing pattern-match rules for extracting information from natural-language texts. This system has obtained promising initial results on extracting information from postings to misc.jobs.offered in order to assemble a database of available jobs. Our overall goal is to combine these techniques to automate the development of natural language systems that can answer queries about information available in a body of texts, such as newsgroup postings or web pages.