Zero Resource Speech Technologies and Models of Early Language Acquisition

The unsupervised (zero resource) discovery of linguistic structure from speech is attracting growing interest from two largely disjoint communities. The machine learning community is increasingly interested in deploying language and speech technologies in languages and dialects with limited or no linguistic resources. The cognitive science community (psycholinguists, linguists, neurolinguists) wants to understand the mechanisms by which infants spontaneously discover linguistic structure. This workshop brings together a team of researchers and graduate students from these two communities for mutual presentations and discussion of current and future issues. Specifically, the workshop has two aims: 1) identifying key issues and open problems of interest to both communities, and 2) setting up standardized, common resources (databases, evaluation criteria, software, etc.) for comparing the different approaches to solving these problems. We will focus mainly, but not exclusively, on the discovery of two levels of linguistic structure: phonetic units and word-like units. We are well aware that the definition of these levels, as well as their segregation from the rest of the linguistic system, is itself a matter of debate, and we welcome discussion of these issues as well.

Dates: Monday July 16 – Friday July 27

July 16th (Krieger 205): Kick-Off Symposium, Day 1

Morning 10:00a-12:30p
10:00a: Welcome and Overview of objectives [video]
10:30a: Aren Jansen (JHU/HLTCOE): Overview of Zero Resource Technology [pdf] [video]
11:30a: Dan Swingley (U. Penn): Overview of Early Language Acquisition [pdf]

Afternoon 2:30p-5:30p
2:30p: Mark Johnson (Macquarie U.): Overview of Bayesian Approaches [pdf] [video]
3:30p: Bill Idsardi (U. Md): Clustering Techniques for Phonetic Categories and their Implications for Phonology [pdf]
4:30p: Emmanuel Dupoux (Ecole Normale Superieure): Modeling Language Bootstrapping: Results & Challenges [video]

July 17th (Krieger 205): Kick-Off Symposium, Day 2

Morning 9:00a-12:30p
9:00a: Naomi Feldman (U. Md): Using Bayesian Approaches to Study Human Sound and Word Learning [pdf] [video]
9:50a: Sharon Goldwater (U. Edinburgh): From Sounds to Words: Bayesian Modeling of Early Language Acquisition [pdf] [video]
10:50a: Sanjeev Khudanpur (JHU): Hybrid Dynamical System Models for Signal Segmentation and Labeling [video]
11:40a: Aren Jansen (JHU/HLTCOE): Towards a Speaker Invariant Representation of Speech [pdf] [video]

Afternoon 2:00p-5:30p
2:00p: Ian McGraw (MIT): Learning the Lexicon: A Pronunciation Mixture Model [pdf] [video]
2:50p: Rick Rose (McGill): Combining Low and High Resource Acoustic Modeling in Spoken Term Detection [pdf] [video]
3:50p: Hynek Hermansky (JHU): Dealing with Previously Unseen Unknowns in the Recognition of Speech [pdf] [video]
4:40p: Ken Church (IBM): OOVs, Pseudo-truth, and Zero Resource Methods [pdf] [video]

For select symposium abstracts, click here.

July 18th (NEB 225): Organization: Data, metrics, subprojects/subteams

July 19th-26th (NEB 225): Collaboration period, informal talks

July 27th (Hackerman B17): Final Presentation at 2:00p [pdf] [video1] [video2]

Informal Presentations

Benjamin Börschinger: Particle Filtering for Word Segmentation [pdf1] [pdf2]

Mark Johnson: Bayesian Methods Tutorial [pdf]

Hynek Hermansky: Acoustic Processing Tutorial [pdf]

Shinji Watanabe: Integrated Bayesian Unsupervised Acoustic, Lexical, and Language Models [pdf]

Pascal Clark: Rhythmic Demodulation for Zero-Resource Speech Recognition [pdf]

Jason Eisner: Discovering Morphological Paradigms from Plain Text Using a Dirichlet Process Mixture Model [pdf]


Team Members 
Senior Members
Ken Church (IBM Research)
Pascal Clark (Johns Hopkins University, HLTCOE)
Emmanuel Dupoux (Ecole Normale Superieure)
Naomi Feldman (University of Maryland)
Sharon Goldwater (University of Edinburgh)
Hynek Hermansky (Johns Hopkins University)
Aren Jansen (Johns Hopkins University, HLTCOE)
Mark Johnson (Macquarie University)
Sanjeev Khudanpur (Johns Hopkins University)
Ian McGraw (Massachusetts Institute of Technology)
Richard Rose (McGill University)
Graduate Students
Erin Bennett (University of Maryland)
Benjamin Börschinger (Macquarie University)
Justin Chiu (Carnegie Mellon University)
Ewan Dunbar (University of Maryland)
Abdallah Fourtassi (Ecole Normale Superieure)
David Harwath (Massachusetts Institute of Technology)
Keith Levin (Johns Hopkins University)
Atta Norouzian (McGill University)
Vijay Peddinti (Johns Hopkins University)
Rachel Richardson (University of Maryland)
Thomas Schatz (Ecole Normale Superieure)
Yuriy Shames (Johns Hopkins University)
Samuel Thomas (Johns Hopkins University)
Affiliate Members
Jason Eisner (Johns Hopkins University)
Steven Greenberg (Transparent Language)
Timothy (TJ) Hazen (MIT Lincoln Laboratory)
Florian Metze (Carnegie Mellon University)
Mike Seltzer (Microsoft Research)
Dan Swingley (University of Pennsylvania)
Balakrishnan Varadarajan (Google)
Shinji Watanabe (Mitsubishi Electronics Research Lab)

Center for Language and Speech Processing