Seminars

Oct
2
Mon
CLSP Student Seminar – Anna Favaro @ Hackerman Hall B17
Oct 2 @ 12:00 pm – 1:15 pm
Oct
6
Fri
CLSP Student Seminar – Andrew Blair-Stanek “Shelter Check and GPT-4’s Bad Legal Performance” @ Hackerman Hall B17
Oct 6 @ 12:00 pm – 1:15 pm

Abstract

Our goal is to use AI to automatically find tax minimization strategies, an approach which we call “Shelter Check.” It would come in two variants. Existing-Authority Shelter Check would aim to find whether existing tax law authorities can be combined to create tax minimization strategies, so the IRS or Congress can shut them down. New-Authority Shelter Check would automate checking whether a new tax law authority – like proposed legislation or a draft court decision – would combine with existing authorities to create a new tax minimization strategy. We initially had high hopes for GPT-* large language models for implementing Shelter Check, but our tests have shown that they do very poorly at basic legal reasoning and handling legal text. So we are now creating a benchmark and training data for LLMs’ handling of legal text, hoping to spur improvements.

Oct
9
Mon
Wei-Ning Hsu (Meta Foundational AI Research) “Large Scale Universal Speech Generative Models” @ Hackerman Hall B17
Oct 9 @ 12:00 pm – 1:15 pm

Abstract

Large-scale generative models such as GPT and DALL-E have revolutionized natural language processing and computer vision research. These models not only generate high-fidelity text or image outputs, but also demonstrate impressive domain and task generalization capabilities. In contrast, audio generative models are relatively primitive in scale and generalization.

In this talk, I will start with a brief introduction to conventional neural speech generative models and discuss why they are unfit for scaling to Internet-scale data. Next, by reviewing the latest large-scale generative models for text and image, I will outline a few promising approaches to building scalable speech models. Last, I will present Voicebox, our latest work to advance this area. Voicebox is the most versatile generative model for speech. It is trained with a simple task (text-conditioned speech infilling) on over 50K hours of multilingual speech with a powerful flow-matching objective. Through in-context learning, Voicebox can perform monolingual/cross-lingual zero-shot TTS, holistic style conversion, transient noise removal, content editing, and diverse sample generation. Moreover, Voicebox achieves state-of-the-art performance and excellent run-time efficiency.

Biography

Wei-Ning Hsu is a research scientist at Meta Foundational AI Research (FAIR) and currently the lead of the audio generation team. His research focuses on self-supervised learning and generative models for speech and audio. His pioneering work includes HuBERT, AV-HuBERT, TextlessNLP, data2vec, wav2vec-U, textless speech translation, and Voicebox. 

Prior to joining Meta, Wei-Ning worked at MERL and Google Brain as a research intern. He received his Ph.D. and S.M. degrees in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology in 2020 and 2018, respectively, under the supervision of Dr. James Glass. He received his B.S. degree in Electrical Engineering from National Taiwan University in 2014, under the supervision of Prof. Lin-shan Lee and Prof. Hsuan-Tien Lin.

Oct
13
Fri
Antoine Bosselut (EPFL) “From Mechanistic Interpretability to Mechanistic Reasoning” @ Hackerman Hall B17
Oct 13 @ 12:00 pm – 1:15 pm

Abstract

Pretrained language models (LMs) encode implicit representations of knowledge in their parameters. Despite this observation, our best methods for interpreting these representations yield few actionable insights on how to manipulate this parameter space for downstream benefit. In this talk, I will present work on methods that simulate machine reasoning by localizing and modifying parametric knowledge representations. First, I will present a method for discovering knowledge-critical subnetworks within pretrained language models, and show that these sparse computational subgraphs are responsible for the model’s ability to encode specific pieces of knowledge. Then, I will present a new reasoning algorithm, RECKONING, a bi-level optimisation procedure that dynamically encodes and reasons over new knowledge at test-time using the model’s existing learned knowledge representations as a scratchpad. Finally, I will discuss next steps and challenges in using internal model mechanisms for reasoning.

Bio

Antoine Bosselut is an assistant professor in the School of Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL). He was a postdoctoral scholar at Stanford University and a Young Investigator at the Allen Institute for AI (AI2). He completed his PhD at the University of Washington and was a student researcher at Microsoft Research. His research interests are in building systems that mix knowledge and language representations to solve problems in NLP, specializing in commonsense representation and reasoning.

Oct
16
Mon
CLSP Student Seminar – Maliha Jahan
Oct 16 @ 12:00 pm – 1:15 pm
Oct
23
Mon
CLSP Student Seminar – David Mueller @ Hackerman Hall B17
Oct 23 @ 12:00 pm – 1:15 pm
Oct
27
Fri
Sharon Goldwater (University of Edinburgh) “Analyzing Representations of Self-Supervised Speech Models” @ Hackerman Hall B17
Oct 27 @ 12:00 pm – 1:15 pm

Abstract

Recent advances in speech technology make heavy use of pre-trained models that learn from large quantities of raw (untranscribed) speech, using “self-supervised” (i.e., unsupervised) learning. These models learn to transform the acoustic input into a different representational format that makes supervised learning (for tasks such as transcription or even translation) much easier. However, *what* speech-relevant information is encoded in these representations, and *how* it is encoded, is not well understood. I will talk about some work at various stages of completion in which my group is analyzing the structure of these representations, to gain a more systematic understanding of how word-level, phonetic, and speaker information is encoded.

Biography

Sharon Goldwater is a Professor in the Institute for Language, Cognition and Computation at the University of Edinburgh’s School of Informatics. She received her PhD in 2007 from Brown University and spent two years as a postdoctoral researcher at Stanford University before moving to Edinburgh. Her research interests include unsupervised and minimally-supervised learning for speech and language processing, computer modelling of language acquisition in children, and computational studies of language use. Her main focus within linguistics has been on the lower levels of structure including phonetics, phonology, and morphology.

Prof. Goldwater has received awards including the 2016 Roger Needham Award from the British Computer Society for “distinguished research contribution in computer science by a UK-based researcher who has completed up to 10 years of post-doctoral research.” She has served on the editorial boards of several journals, including Computational Linguistics, Transactions of the Association for Computational Linguistics, and the inaugural board of OPEN MIND: Advances in Cognitive Science. She was a program chair for the EACL 2014 Conference and chaired the EACL governing board from 2019 to 2020.

Center for Language and Speech Processing