What does “bwick” have that “bnick” does not? – Adam Albright (MIT)

January 31, 2006

Native speakers of English generally agree that although “blick” is not actually a word, it is quite plausible as one; “bwick”, on the other hand, is somewhat odd, and “bnick” is impossible. What computations are used to make these assessments, and what types of knowledge do they rely on? In this talk I contrast two classes of models of gradient wordlikeness: exemplar-based (lazy learning) models, in which novel words are compared to the lexicon of existing words, and sequence-based (n-gram-type) models, in which the likelihood of a novel word depends on the probability of its subparts. It is often assumed that an exemplar approach is necessary for modeling gradient wordlikeness, since both [bw] and [bn] are non-occurring, zero-probability sequences, yet [bw] is more similar to existing [br], [bl], etc. I present a sequence-based model that overcomes this difficulty by considering natural classes, that is, sets of sounds that share phonological feature values. I show that when we compare the predictions of these models against human judgments, a sequential model that uses natural classes outperforms nearest-neighbor/exemplar-based models in a variety of important respects.
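As a rough illustration of the zero-probability problem described above, the following Python sketch scores novel words with a plain segment bigram model and with a variant that can also fall back on bigrams over natural classes. It is not the model presented in the talk: the toy lexicon, the class inventory in `CLASSES`, the one-character-per-segment transcription, and the scoring rule (class bigram frequency divided by the sizes of the two classes) are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical toy lexicon, one character per segment ("blik" stands in for "blick").
LEXICON = ["brik", "blik", "drik", "glik", "snik", "smik", "trik", "plik"]

# Hypothetical natural classes; each segment belongs to exactly one class here.
CLASSES = {
    "stop": set("bdgptk"),
    "approximant": set("rlwj"),
    "nasal": set("nm"),
    "fricative": set("s"),
    "vowel": set("i"),
}


def class_of(segment):
    """Return the name of the (single) class containing the segment."""
    for name, members in CLASSES.items():
        if segment in members:
            return name
    raise ValueError(f"no class for segment {segment!r}")


def bigrams(word):
    """Return the adjacent segment pairs of a word."""
    return list(zip(word, word[1:]))


# Count bigrams over segments and over classes in the toy lexicon.
segment_counts = Counter(bg for w in LEXICON for bg in bigrams(w))
class_counts = Counter(
    (class_of(a), class_of(b)) for w in LEXICON for a, b in bigrams(w)
)
TOTAL = sum(segment_counts.values())


def segment_score(a, b):
    """Relative frequency of the exact segment pair (0 if unattested)."""
    return segment_counts[(a, b)] / TOTAL


def class_score(a, b):
    """Relative frequency of the class pair, shared evenly among its members."""
    ca, cb = class_of(a), class_of(b)
    return class_counts[(ca, cb)] / TOTAL / (len(CLASSES[ca]) * len(CLASSES[cb]))


def word_score(word, use_classes):
    """Product of bigram scores; optionally take the better of the two descriptions."""
    score = 1.0
    for a, b in bigrams(word):
        s = segment_score(a, b)
        if use_classes:
            s = max(s, class_score(a, b))
        score *= s
    return score


for w in ["blik", "bwik", "bnik"]:
    print(w, word_score(w, use_classes=False), word_score(w, use_classes=True))
```

With segments alone, "bwik" and "bnik" both score zero because neither [bw] nor [bn] is attested in the toy lexicon. With the class fallback, "bwik" receives a small nonzero score because stop + approximant sequences such as [br] and [bl] are attested, while "bnik" stays at zero because this lexicon contains no stop + nasal sequence.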

Adam Albright received his Ph.D. in linguistics from UCLA in 2002. He was a Faculty Fellow at UC Santa Cruz from 2002 to 2004, and is currently an Assistant Professor at MIT. His research interests include phonology, morphology, and learnability, with an emphasis on using computational modeling and experimental techniques to investigate issues in phonological theory.

Center for Language and Speech Processing