Optimality Theory Syntax Learnability: An Empirical Exploration of the Perceptron and GLA
Ann Irvine, Mark Dredze, Geraldine Legendre and Paul Smolensky
CogSci Workshop on OT as a General Cognitive Architecture – 2011
Abstract
This work brings together several threads of research on Optimality Theory (OT) and Harmonic Grammar (HG) learnability. As noted in previous work, including Pater (2008) and Magri (2010), the perceptron learning algorithm is well established in the machine learning field and is a natural choice for modeling human grammar acquisition. The algorithm learns from one observation at a time, and it is capable of learning from a noisy corpus of observed natural language. In this work, we use the perceptron algorithm to learn a model that specifies a set of constraint weights relevant to one syntactic phenomenon, Czech word order. We extract training data (sentences annotated with grammatical and information structure and their surface word orders) from the Prague Dependency Treebank (Hajic et al., 2001) and use basic alignment (edge-most) constraints on grammatical and information structure to predict the surface order of the subject, verb, and object. The perceptron algorithm learns a set of numerically weighted constraints (a Harmonic Grammar). Ordering the constraints by the magnitude of their weights may specify a hierarchical constraint ranking (an OT grammar), which is the essence of the classic Gradual Learning Algorithm (GLA) (Boersma, 1997). We describe and compare the two learning algorithms in detail and use a held-out set of empirical data to evaluate each quantitatively. We show that, by allowing for so-called ganging-up effects, the more expressive Harmonic Grammar models Czech word order more accurately than the GLA OT grammar. Crucially, the Harmonic Grammar model is also capable of modeling variation in production.
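The learning setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two constraint names, the violation counts, the initial weights, and the learning rate are all hypothetical values chosen for illustration. The sketch shows an error-driven perceptron update over constraint weights (a Harmonic Grammar) and then reads a ranking off the weight magnitudes; it does not implement the stochastic evaluation of the GLA itself.

```python
from typing import Dict, List, Tuple

# Hypothetical alignment constraints (names are illustrative, not from the paper).
NAMES = ["SUBJ-LEFT", "FOCUS-RIGHT"]

# Violation counts per candidate word order for one hypothetical input
# in which the subject is focused. Values are made up for illustration.
CANDIDATES: Dict[str, Tuple[int, ...]] = {
    "SVO": (0, 2),  # subject at left edge; focused subject is 2 words from right edge
    "OVS": (2, 0),  # subject is 2 words from left edge; focused subject at right edge
}


def harmony(weights: List[float], violations: Tuple[int, ...]) -> float:
    """Harmony is the negated weighted sum of constraint violations."""
    return -sum(w * v for w, v in zip(weights, violations))


def predict(weights: List[float], candidates: Dict[str, Tuple[int, ...]]) -> str:
    """Return the candidate with maximal harmony (first one wins on ties)."""
    return max(candidates, key=lambda name: harmony(weights, candidates[name]))


def perceptron_update(weights, candidates, observed, rate):
    """On an error, raise weights of constraints the wrong guess violates
    more than the observed form, and lower the others."""
    guess = predict(weights, candidates)
    if guess == observed:
        return list(weights)
    return [
        w + rate * (g - o)
        for w, g, o in zip(weights, candidates[guess], candidates[observed])
    ]


def train(data, weights, rate=0.5, epochs=5):
    """Process one observation at a time, as an online learner would."""
    for _ in range(epochs):
        for candidates, observed in data:
            weights = perceptron_update(weights, candidates, observed, rate)
    return weights


def rank_constraints(names, weights):
    """Order constraints from highest to lowest weight."""
    pairs = sorted(zip(names, weights), key=lambda p: p[1], reverse=True)
    return [name for name, _ in pairs]


# One hypothetical observation: the attested surface order is OVS.
learned = train([(CANDIDATES, "OVS")], [1.0, 1.0])
ranking = rank_constraints(NAMES, learned)
```

In this toy run the learner starts with equal weights, guesses SVO on the tie, and shifts weight toward the constraint that favors the observed order. A weighted grammar of this kind can let several lower-weighted constraints jointly outweigh a higher-weighted one, which a strict ranking cannot express.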