Tal Linzen (JHU) “Structure-Sensitive Dependency Learning in Recurrent Neural Networks”

When:
September 22, 2017 @ 12:00 pm – 1:15 pm
Where:
Hackerman Hall B17
3400 N Charles St
Baltimore, MD 21218
USA
Cost:
Free
Contact:
Center for Language and Speech Processing

Abstract

Neural networks have recently become ubiquitous in natural language processing systems, but we typically have little understanding of the specific capabilities of these networks beyond their overall accuracy on an applied task. The present work investigates the ability of recurrent neural networks (RNNs), which are not equipped with explicit syntactic representations, to learn structure-sensitive dependencies from a natural corpus; we use English subject-verb number agreement as our test case.

We examine the success of the RNNs (in particular LSTMs) in predicting whether an upcoming English verb should be plural or singular. We focus on specific sentence types that are indicative of the network’s syntactic abilities; our tests use both naturally occurring sentences and constructed sentences from the experimental psycholinguistics literature. We analyze the internal representations of the network to explore the sources of its ability (or inability) to approximate sentence structure. Finally, we compare the errors made by the RNNs to agreement attraction errors made by humans.
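As a rough illustration of the kind of setup described above (a sketch under assumptions, not the speaker's actual code or data), the snippet below trains a small LSTM classifier to read the words preceding a verb and predict whether that verb should be singular or plural; the toy sentences, model dimensions, and all names are illustrative assumptions.

# Minimal sketch of a verb-number prediction setup (hypothetical; not the speaker's code).
# An LSTM reads the words preceding the verb and classifies the verb as singular or plural.
import torch
import torch.nn as nn

# Toy preamble/label pairs; the actual experiments use naturally occurring corpus sentences.
data = [
    ("the key to the cabinets".split(), 0),   # 0 = singular verb expected ("is")
    ("the keys to the cabinet".split(), 1),   # 1 = plural verb expected ("are")
]
vocab = {w: i for i, w in enumerate(sorted({w for words, _ in data for w in words}))}

class NumberPredictor(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 2)   # logits over {singular, plural}

    def forward(self, token_ids):
        emb = self.embed(token_ids)           # (batch, seq_len, emb_dim)
        _, (h, _) = self.lstm(emb)            # final hidden state summarizes the preamble
        return self.out(h[-1])

model = NumberPredictor(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):
    for words, label in data:
        ids = torch.tensor([[vocab[w] for w in words]])
        loss = loss_fn(model(ids), torch.tensor([label]))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

In the work discussed in the talk, this explicit number-prediction objective is contrasted with a standard language modeling objective, and the preambles are drawn both from corpora and from constructed psycholinguistic materials.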

RNNs were able to approximate certain aspects of syntactic structure very well, but only in common sentence types and only when trained specifically to predict the number of a verb (as opposed to a standard language modeling objective). In complex sentences their performance degraded substantially; they made many more errors than human participants. These results suggest that stronger inductive biases are likely to be necessary to eliminate errors altogether; we begin to investigate to what extent these biases can arise from multi-task learning. More broadly, our work illustrates how methods from linguistics and psycholinguistics can help us understand the abilities and limitations of “black-box” neural network models.

Biography

Tal Linzen is an Assistant Professor of Cognitive Science at Johns Hopkins University. Before moving to Johns Hopkins, he was a postdoctoral researcher at the École Normale Supérieure in Paris, where he worked with Emmanuel Dupoux and Benjamin Spector, and was affiliated with the Laboratoire de Sciences Cognitives et Psycholinguistique and the Institut Jean Nicod. Dr. Linzen obtained his PhD from the Department of Linguistics at New York University in 2015, under the supervision of Alec Marantz. His interests are in developing and testing cognitive models of human language; particular problems he has worked on include probabilistic prediction in language comprehension, generalization in language learning, and the linguistic capacities of artificial neural networks.
