Seminars

Sasha Rush (Cornell University) “Pretraining Without Attention” @ Hackerman Hall B17
Fri, Feb 3 @ 12:00 pm – 1:15 pm

Abstract

Transformers are essential to pretraining. As we approach 5 years of BERT, the connection between attention as architecture and transfer learning remains key to this central thread in NLP. Other architectures such as CNNs and RNNs have been used to replicate pretraining results, but these either fail to reach the same accuracy or require supplemental attention layers. This work revisits the seminal BERT result and considers pretraining without attention. We consider replacing self-attention layers with recently developed approaches for long-range sequence modeling and transformer architecture variants. Specifically, inspired by recent papers like the structured state space sequence model (S4), we use simple routing layers based on state-space models (SSM) and a bidirectional model architecture based on multiplicative gating. We discuss the results of the proposed Bidirectional Gated SSM (BiGS) and present a range of analyses of its properties. Results show that architecture does seem to have a notable impact on downstream performance, as well as a different inductive bias that is worth exploring further.
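For readers unfamiliar with the ingredients named above, here is a minimal, illustrative sketch of a bidirectional, multiplicatively gated token-mixing block. It is not the speaker's BiGS implementation: a depthwise 1-D convolution stands in for the learned SSM (e.g., S4) kernel, and all class and parameter names are hypothetical.

# Illustrative sketch only (not the speaker's BiGS code): a bidirectional,
# multiplicatively gated token-mixing block. A depthwise 1-D convolution
# stands in for the learned SSM (e.g., S4) kernel; names are hypothetical.
import torch
import torch.nn as nn

class GatedBidirectionalMixer(nn.Module):
    def __init__(self, d_model: int, kernel_size: int = 7):
        super().__init__()
        self.in_proj = nn.Linear(d_model, 2 * d_model)  # values and gate
        self.fwd_mix = nn.Conv1d(d_model, d_model, kernel_size,
                                 padding=kernel_size - 1, groups=d_model)
        self.bwd_mix = nn.Conv1d(d_model, d_model, kernel_size,
                                 padding=kernel_size - 1, groups=d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, length, d_model)
        v, g = self.in_proj(x).chunk(2, dim=-1)
        v = v.transpose(1, 2)                                # (batch, d_model, length)
        L = v.size(-1)
        fwd = self.fwd_mix(v)[..., :L]                       # left-to-right mixing
        bwd = self.bwd_mix(v.flip(-1))[..., :L].flip(-1)     # right-to-left mixing
        mixed = (fwd + bwd).transpose(1, 2)
        return self.out_proj(mixed * torch.sigmoid(g))       # multiplicative gate

# Example usage: y = GatedBidirectionalMixer(256)(torch.randn(2, 128, 256))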

Biography

Alexander “Sasha” Rush is an Associate Professor at Cornell Tech. His work is at the intersection of natural language processing and generative modeling with applications in text generation, efficient inference, and controllability. He has written several popular open-source software projects supporting NLP research and data science, and works part-time as a researcher at Hugging Face. He is the secretary of ICLR and developed software used to run virtual conferences during COVID. His work has received paper and demo awards at major NLP, visualization, and hardware conferences, an NSF Career Award, and a Sloan Fellowship. He tweets and blogs, mostly about coding and ML, at @srush_nlp.
Sharon Levy (University of California, Santa Barbara) “Responsible AI via Responsible Large Language Models” @ Hackerman Hall B17
Mon, Feb 6 @ 12:00 pm – 1:15 pm

Abstract

While large language models have advanced the state of the art in natural language processing, these models are trained on large-scale datasets, which may include harmful information. Studies have shown that, as a result, the models exhibit social biases and generate misinformation after training. In this talk, I will discuss my work on analyzing and interpreting the risks of large language models across the areas of fairness, trustworthiness, and safety. I will first describe my research on the detection of dialect bias between African American English (AAE) and Standard American English (SAE). The second part investigates the trustworthiness of models through the memorization and subsequent generation of conspiracy theories. I will end my talk with recent work in AI safety regarding text that may lead to physical harm.

Biography

Sharon is a 5th-year Ph.D. candidate at the University of California, Santa Barbara, where she is advised by Professor William Wang. Her research interests lie in natural language processing, with a focus on Responsible AI. Sharon’s research spans the subareas of fairness, trustworthiness, and safety, with publications in ACL, EMNLP, WWW, and LREC. She has spent summers interning at AWS, Meta, and Pinterest. Sharon is a 2022 EECS Rising Star and a current recipient of the Amazon Alexa AI Fellowship for Responsible AI.

Mark Yatskar (University of Pennsylvania) @ Hackerman Hall B17
Fri, Feb 10 @ 12:00 pm – 1:15 pm

Hanjie Chen (University of Virginia) “Bridging Humans and Machines: Techniques for Trustworthy NLP” @ Hackerman Hall B17
Mon, Feb 20 @ 12:00 pm – 1:15 pm

Abstract

Advanced neural language models have grown ever larger and more complex, pushing forward the limits of language understanding and generation while diminishing interpretability. The black-box nature of deep neural networks prevents humans from understanding, trusting, and using them in real-world applications. This talk will introduce interpretation techniques that bridge the gap between humans and models for developing trustworthy natural language processing (NLP). I will first show how to explain black-box models and evaluate their explanations in order to understand their prediction behavior. Then I will introduce how to improve the interpretability of neural language models by making their decision-making transparent and rationalized. Finally, I will discuss how to diagnose and improve models (e.g., their robustness) through the lens of explanations. I will conclude with future research directions centered on model interpretability and committed to facilitating communication and interaction between intelligent machines, system developers, and end users for long-term trustworthy AI.

Biography

Hanjie Chen is a Ph.D. candidate in Computer Science at the University of Virginia, advised by Prof. Yangfeng Ji. Her research interests lie in Trustworthy AI, Natural Language Processing (NLP), and Interpretable Machine Learning. She develops interpretation techniques to explain neural language models and make their prediction behavior transparent and reliable. She is a recipient of the Carlos and Esther Farrar Fellowship and the Best Poster Award at ACM CAPWIC 2021. Her work has been published at top-tier NLP/AI conferences (e.g., ACL, AAAI, EMNLP, NAACL) and was selected as a finalist for the 2021 National Center for Women & Information Technology (NCWIT) Collegiate Award. As the primary instructor, she co-designed and taught the course Interpretable Machine Learning, for which she received the UVA CS Outstanding Graduate Teaching Award and was nominated for the University-wide Graduate Teaching Awards (top 5% of graduate instructors). More details can be found at https://www.cs.virginia.edu/~hc9mx

Wei Xu (Georgia Tech) @ Hackerman Hall B17
Fri, Feb 24 @ 12:00 pm – 1:15 pm

Saadia Gabriel (University of Washington) “Socially Responsible and Factual Reasoning for Equitable AI Systems” @ Hackerman Hall B17
Mon, Feb 27 @ 12:00 pm – 1:15 pm

Abstract

Understanding the implications underlying a text is critical to assessing its impact, in particular the social dynamics that may result from a reading of the text. This requires endowing artificial intelligence (AI) systems with pragmatic reasoning, for example to correctly conclude that the statement “Epidemics and cases of disease in the 21st century are ‘staged’” relates to unfounded conspiracy theories. In this talk, I discuss how shortcomings in the ability of current AI systems to reason about pragmatics present challenges to the equitable detection of false or harmful language. I demonstrate how these shortcomings can be addressed by imposing human-interpretable structure on deep learning architectures using insights from linguistics. In the first part of the talk, I describe how adversarial text generation algorithms can be used to improve the robustness of content moderation systems. I then introduce a pragmatic formalism for reasoning about harmful implications conveyed by social media text. I show how this pragmatic approach can be combined with generative neural language models to uncover implications of news headlines. I also address the bottleneck to progress in text generation posed by gaps in the evaluation of factuality. I conclude by showing how context-aware content moderation can be used to ensure safe interactions with conversational agents.


Biography

Saadia Gabriel is a PhD candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, advised by Prof. Yejin Choi and Prof. Franziska Roesner. Her research revolves around natural language processing and machine learning, with a particular focus on building systems for understanding how social commonsense manifests in text (i.e., how people typically behave in social scenarios), as well as mitigating the spread of false or harmful text (e.g., COVID-19 misinformation). Her work has been covered by a wide range of media outlets, including Forbes and TechCrunch. It has also received a 2019 ACL best short paper nomination, a 2019 IROS RoboCup best paper nomination, and a best paper award at the 2020 WeCNLP summit. Prior to her PhD, Saadia received a BA summa cum laude in Computer Science and Mathematics from Mount Holyoke College.


Center for Language and Speech Processing