Field Methods for Natural Language Processing
Kevin Cohen, Center for Computational Pharmacology, University of Colorado, School of Medicine
August 1, 2007
Software testing is a first-class research object in computer science, but it has not yet been studied in the context of natural language processing. Testing language processing applications is qualitatively different from testing other types of applications, because language itself is qualitatively different from other classes of input. Nonetheless, a methodology for testing NLP applications already exists. Its theory is isomorphic with descriptive and structural linguistics, and its praxis is isomorphic with linguistic field methods. In this talk, I present data on the state of software testing for a popular class of text mining applications, show the commonalities between software testing and linguistic field methods, and illustrate a number of benefits that accrue from approaching language processing from a software testing perspective in general, and from a descriptive linguistic perspective in particular.