Seminars

Nov
11
Fri
Hui Guan (University of Massachusetts Amherst) “Towards Accurate and Efficient Edge Computing Via Multi-Task Learning” @ Hackerman Hall B17
Nov 11 @ 12:00 pm – 1:15 pm

Abstract

AI-powered applications increasingly adopt Deep Neural Networks (DNNs) for solving many prediction tasks, leading to more than one DNN running on resource-constrained devices. Supporting many models simultaneously on a device is challenging due to linearly increasing computation, energy, and storage costs. An effective approach to this problem is multi-task learning (MTL), in which a set of tasks is learned jointly so that some parameters can be shared among tasks. MTL creates multi-task models based on common DNN architectures and has shown significantly reduced inference costs and improved generalization performance in many machine learning applications. In this talk, we will introduce our recent efforts to leverage MTL to improve accuracy and efficiency for edge computing. The talk will present multi-task architecture design systems that can automatically identify resource-efficient multi-task models with low inference costs and high task accuracy.
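
As a rough illustration of the parameter-sharing idea behind MTL, the minimal PyTorch sketch below (not the speaker's actual system; all names and dimensions are invented) shows a single shared backbone feeding several small task-specific heads, so most of the computation and storage is paid once rather than per task.

```python
# Minimal hard-parameter-sharing sketch for multi-task learning (illustrative only).
from typing import Dict

import torch
import torch.nn as nn


class SharedBackboneMTL(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int, task_out_dims: Dict[str, int]):
        super().__init__()
        # Shared trunk: its cost is paid once, regardless of the number of tasks.
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # One lightweight head per task: the only per-task parameters.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden_dim, out) for name, out in task_out_dims.items()}
        )

    def forward(self, x: torch.Tensor) -> Dict[str, torch.Tensor]:
        shared = self.backbone(x)  # one forward pass serves every task
        return {name: head(shared) for name, head in self.heads.items()}


# Two hypothetical tasks share the same backbone in a single forward pass.
model = SharedBackboneMTL(in_dim=64, hidden_dim=128,
                          task_out_dims={"task_a": 10, "task_b": 3})
outputs = model(torch.randn(8, 64))
print({name: out.shape for name, out in outputs.items()})
```
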
Biography

Hui Guan is an Assistant Professor in the College of Information and Computer Sciences (CICS) at the University of Massachusetts Amherst, the flagship campus of the UMass system. She received her Ph.D. in Electrical Engineering from North Carolina State University in 2020. Her research lies at the intersection of machine learning and systems, with an emphasis on improving the speed, scalability, and reliability of machine learning through innovations in algorithms and programming systems. Her current research focuses on both algorithm and system optimizations of deep multi-task learning and graph machine learning.
Nov
18
Fri
Angela Fan (Meta AI Research) “No Language Left Behind: Scaling Human-Centered Machine Translation” @ Hackerman Hall B17
Nov 18 @ 12:00 pm – 1:15 pm

Abstract

Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today. However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200-language barrier while ensuring safe, high-quality results and keeping ethical considerations in mind? In this talk, I introduce No Language Left Behind, an initiative to break language barriers for low-resource languages. In No Language Left Behind, we took on the low-resource language translation challenge by first contextualizing the need for translation support through exploratory interviews with native speakers. Then, we created datasets and models aimed at narrowing the performance gap between low- and high-resource languages. We proposed multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically, we evaluated the performance of over 40,000 different translation directions using a human-translated benchmark, Flores-200, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state of the art, laying important groundwork towards realizing a universal translation system in an open-source manner.

Biography

Angela is a research scientist at Meta AI Research in New York, focusing on supporting efforts in speech and language research. Recent projects include No Language Left Behind (https://ai.facebook.com/research/no-language-left-behind/) and Universal Speech Translation for Unwritten Languages (https://ai.facebook.com/blog/ai-translation-hokkien/). Before working on translation, Angela focused on research in on-device models for NLP and computer vision, and on text generation.

Feb
3
Fri
Sasha Rush (Cornell University) “Pretraining Without Attention” @ Hackerman Hall B17
Feb 3 @ 12:00 pm – 1:15 pm

Abstract

Transformers are essential to pretraining. As we approach five years of BERT, the connection between attention as an architecture and transfer learning remains central to this thread of NLP research. Other architectures such as CNNs and RNNs have been used to replicate pretraining results, but they either fail to reach the same accuracy or require supplemental attention layers. This work revisits the seminal BERT result and considers pretraining without attention. We consider replacing self-attention layers with recently developed approaches for long-range sequence modeling and with variants of the transformer architecture. Specifically, inspired by recent work such as the structured state space sequence model (S4), we use simple routing layers based on state-space models (SSMs) and a bidirectional model architecture based on multiplicative gating. We discuss the results of the proposed Bidirectional Gated SSM (BiGS) and present a range of analyses of its properties. The results show that architecture does seem to have a notable impact on downstream performance and yields a different inductive bias that is worth exploring further.
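
For intuition, here is a heavily simplified PyTorch sketch of a bidirectional block with multiplicative gating in the spirit described above. It is not the BiGS implementation: the state-space routing layer is replaced by a trivial cumulative-average token mixer purely to keep the example short, and all layer names are invented.

```python
# Illustrative bidirectional gated block; an actual BiGS block would use an
# S4-style state-space model where `causal_mean_mix` appears below.
import torch
import torch.nn as nn
import torch.nn.functional as F


def causal_mean_mix(x: torch.Tensor) -> torch.Tensor:
    """Stand-in token mixer: running mean over previous tokens, shape (batch, seq, dim)."""
    counts = torch.arange(1, x.size(1) + 1, device=x.device).view(1, -1, 1)
    return x.cumsum(dim=1) / counts


class GatedBidirectionalBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.fwd_proj = nn.Linear(dim, dim)   # left-to-right branch
        self.bwd_proj = nn.Linear(dim, dim)   # right-to-left branch
        self.gate_proj = nn.Linear(dim, dim)  # multiplicative gate
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        fwd = causal_mean_mix(F.gelu(self.fwd_proj(h)))
        bwd = causal_mean_mix(F.gelu(self.bwd_proj(h)).flip(1)).flip(1)
        gate = torch.sigmoid(self.gate_proj(h))
        # The gate modulates the combined bidirectional signal; residual connection outside.
        return x + self.out_proj(gate * (fwd + bwd))


block = GatedBidirectionalBlock(dim=32)
print(block(torch.randn(2, 16, 32)).shape)  # torch.Size([2, 16, 32])
```
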

Biography

Alexander “Sasha” Rush is an Associate Professor at Cornell Tech. His work is at the intersection of natural language processing and generative modeling with applications in text generation, efficient inference, and controllability. He has written several popular open-source software projects supporting NLP research and data science, and works part-time as a researcher at Hugging Face. He is the secretary of ICLR and developed software used to run virtual conferences during COVID. His work has received paper and demo awards at major NLP, visualization, and hardware conferences, an NSF Career Award, and a Sloan Fellowship. He tweets and blogs, mostly about coding and ML, at @srush_nlp.
Feb
6
Mon
Sharon Levy (University of California, Santa Barbara) “Responsible AI via Responsible Large Language Models” @ Hackerman Hall B17
Feb 6 @ 12:00 pm – 1:15 pm

Abstract

While large language models have advanced the state of the art in natural language processing, these models are trained on large-scale datasets, which may include harmful information. Studies have shown that, as a result, the models exhibit social biases and generate misinformation after training. In this talk, I will discuss my work on analyzing and interpreting the risks of large language models across the areas of fairness, trustworthiness, and safety. I will first describe my research on the detection of dialect bias between African American English (AAE) and Standard American English (SAE). The second part investigates the trustworthiness of models through the memorization and subsequent generation of conspiracy theories. I will end my talk with recent work in AI safety regarding text that may lead to physical harm.

Biography

Sharon is a 5th-year Ph.D. candidate at the University of California, Santa Barbara, where she is advised by Professor William Wang. Her research interests lie in natural language processing, with a focus on Responsible AI. Sharon’s research spans the subareas of fairness, trustworthiness, and safety, with publications in ACL, EMNLP, WWW, and LREC. She has spent summers interning at AWS, Meta, and Pinterest. Sharon is a 2022 EECS Rising Star and a current recipient of the Amazon Alexa AI Fellowship for Responsible AI.

Feb
10
Fri
Mark Yatskar (University of Pennsylvania) “Understanding Dataset Biases: Behavioral Indicators During Annotation and Contrastive Mitigations” @ Hackerman Hall B17
Feb 10 @ 12:00 pm – 1:15 pm

Abstract

Biases in datasets, or unintentionally introduced spurious cues, are a common source of misspecification in machine learning. Performant models trained on such data can reproduce gender stereotypes or be brittle under distribution shift. In this talk, we present several results in multimodal and question answering applications studying sources of dataset bias, along with several mitigation methods. We propose approaches where known dimensions of dataset bias are explicitly factored out of a model during learning, without needing to modify the data. Finally, we ask whether dataset biases can be attributed to annotator behavior during annotation. Drawing inspiration from work in psychology on cognitive biases, we show that certain behavioral patterns are highly indicative of the creation of problematic (but valid) data instances in question answering. We give evidence that many existing observations about how dataset bias propagates to models can be attributed to data samples created by the annotators we identify.

Biography

Mark Yatskar is an Assistant Professor in the Department of Computer and Information Science at the University of Pennsylvania. He did his PhD at the University of Washington, co-advised by Luke Zettlemoyer and Ali Farhadi. He was a Young Investigator at the Allen Institute for Artificial Intelligence for several years, working with their computer vision team, Prior. His work spans Natural Language Processing, Computer Vision, and Fairness in Machine Learning. He received a Best Paper Award at EMNLP for work on gender bias amplification, and his work has been featured in Wired and the New York Times.

Feb
24
Fri
Wei Xu (Georgia Tech) “GPT-3 vs Humans: Rethinking Evaluation of Natural Language Generation” @ Hackerman Hall B17
Feb 24 @ 12:00 pm – 1:15 pm

Abstract

While GPT models have shown impressive performance on summarization and open-ended text generation, it’s important to assess their abilities on more constrained text generation tasks that require significant and diverse rewritings. In this talk, I will discuss the challenges of evaluating systems that are highly competitive and perform close to humans on two such tasks: (i) paraphrase generation and (ii) text simplification. To address these challenges, we introduce an interactive Rank-and-Rate evaluation framework. Our results show that GPT-3.5 has made a major step up from fine-tuned T5 in paraphrase generation, but still lacks the diversity and creativity of humans who spontaneously produce large quantities of paraphrases.

Additionally, we demonstrate that GPT-3.5 performs similarly to a single human in text simplification, which makes it difficult for existing automatic evaluation metrics to distinguish between the two. To overcome this shortcoming, we propose LENS, a learnable evaluation metric that outperforms SARI, BERTScore, and other existing methods in both automatic evaluation and minimum risk decoding for text generation.
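
For readers unfamiliar with the baseline metrics mentioned above, the short sketch below shows how a reference-based score such as SARI is typically computed, assuming the "sari" metric packaged in the Hugging Face evaluate library; it is illustrative only and is not the proposed LENS metric.

```python
# Hedged example: computing SARI for one simplification, assuming the
# Hugging Face `evaluate` library exposes the metric under the id "sari".
import evaluate

sari = evaluate.load("sari")
sources = ["About 95 species are currently accepted."]
predictions = ["About 95 species are currently known."]
references = [["About 95 species are currently known.",
               "About 95 species are now accepted."]]
print(sari.compute(sources=sources, predictions=predictions, references=references))
```
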

Biography

Wei Xu is an assistant professor in the School of Interactive Computing at the Georgia Institute of Technology, where she is also affiliated with the new NSF AI CARING Institute and the Machine Learning Center. She received her Ph.D. in Computer Science from New York University and her B.S. and M.S. from Tsinghua University. Xu’s research interests are in natural language processing, machine learning, and social media, with a focus on text generation, stylistics, robustness and controllability of machine learning models, and reading and writing assistive technology. She is a recipient of the NSF CAREER Award, the CrowdFlower AI for Everyone Award, the Criteo Faculty Research Award, and a Best Paper Award at COLING’18. She has also received research funding from DARPA and IARPA. She is an elected member of the NAACL executive board and regularly serves as a senior area chair for AI/NLP conferences.

Feb
27
Mon
Saadia Gabriel (University of Washington) “Socially Responsible and Factual Reasoning for Equitable AI Systems” @ Hackerman Hall B17
Feb 27 @ 12:00 pm – 1:15 pm

Abstract

Understanding the implications underlying a text is critical to assessing its impact, in particular the social dynamics that may result from a reading of the text. This requires endowing artificial intelligence (AI) systems with pragmatic reasoning, for example to correctly conclude that the statement “Epidemics and cases of disease in the 21st century are ‘staged’” relates to unfounded conspiracy theories. In this talk, I discuss how shortcomings in the ability of current AI systems to reason about pragmatics present challenges to equitable detection of false or harmful language. I demonstrate how these shortcomings can be addressed by imposing human-interpretable structure on deep learning architectures using insights from linguistics.

In the first part of the talk, I describe how adversarial text generation algorithms can be used to improve the robustness of content moderation systems. I then introduce a pragmatic formalism for reasoning about harmful implications conveyed by social media text. I show how this pragmatic approach can be combined with generative neural language models to uncover implications of news headlines. I also address the bottleneck to progress in text generation posed by gaps in the evaluation of factuality. I conclude by showing how context-aware content moderation can be used to ensure safe interactions with conversational agents.


Biography

Saadia Gabriel is a PhD candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, advised by Prof. Yejin Choi and Prof. Franziska Roesner. Her research revolves around natural language processing and machine learning, with a particular focus on building systems for understanding how social commonsense manifests in text (i.e., how people typically behave in social scenarios), as well as on mitigating the spread of false or harmful text (e.g., Covid-19 misinformation). Her work has been covered by a wide range of media outlets, including Forbes and TechCrunch. It has also received a 2019 ACL best short paper nomination and a 2019 IROS RoboCup best paper nomination, and won a best paper award at the 2020 WeCNLP summit. Prior to her PhD, Saadia received a BA summa cum laude from Mount Holyoke College in Computer Science and Mathematics.


Mar
13
Mon
Hanjie Chen (University of Virginia) “Bridging Humans and Machines: Techniques for Trustworthy NLP” @ Hackerman Hall B17
Mar 13 @ 12:00 pm – 1:15 pm

Abstract

Advanced neural language models have grown ever larger and more complex, pushing forward the limits of language understanding and generation while diminishing interpretability. The black-box nature of deep neural networks prevents humans from understanding them, as well as from trusting and using them in real-world applications. This talk will introduce interpretation techniques that bridge the gap between humans and models for developing trustworthy natural language processing (NLP). I will first show how to explain black-box models and evaluate their explanations in order to understand their prediction behavior. Then I will introduce how to improve the interpretability of neural language models by making their decision-making transparent and rationalized. Finally, I will discuss how to diagnose and improve models (e.g., their robustness) through the lens of explanations. I will conclude with future research directions that are centered around model interpretability and committed to facilitating communication and interaction between intelligent machines, system developers, and end users for long-term trustworthy AI.

Biography

Hanjie Chen is a Ph.D. candidate in Computer Science at the University of Virginia, advised by Prof. Yangfeng Ji. Her research interests lie in Trustworthy AI, Natural Language Processing (NLP), and Interpretable Machine Learning. She develops interpretation techniques to explain neural language models and make their prediction behavior transparent and reliable. She is a recipient of the Carlos and Esther Farrar Fellowship and the Best Poster Award at ACM CAPWIC 2021. Her work has been published at top-tier NLP/AI conferences (e.g., ACL, AAAI, EMNLP, NAACL) and was selected as a finalist for the 2021 National Center for Women & Information Technology (NCWIT) Collegiate Award. She (as the primary instructor) co-designed and taught the course Interpretable Machine Learning, and was awarded the UVA CS Outstanding Graduate Teaching Award and nominated for the University-wide Graduate Teaching Awards (top 5% of graduate instructors). More details can be found at https://www.cs.virginia.edu/~hc9mx

Nov
3
Fri
Eugenia Rho (Virginia Tech) “Words Matter: How Language Choices Predict Societal Trends and Outcomes in Media, Health and Policing” @ Hackerman Hall B17
Nov 3 @ 12:00 pm – 1:15 pm

Abstract

Effective communication lies at the heart of social harmony and individual well-being. However, key areas of our society face profound challenges in how we talk about things, or to each other. In this talk, I will show how these challenges manifest: from the manner in which TV reporters discuss current events, to online health discussions in banned Reddit communities, to interactions between law enforcement and communities of color during routine car stops. My research applies theories from linguistics and psychology to analyze patterns in such dialogue using large language models (LLMs), statistics, and experimental design. In this presentation, I will introduce three research studies that highlight how specific patterns in our language choices are predictive of real-world outcomes. First, I will illustrate how partisan divides in the language of America’s two major broadcasting news stations over the past decade directly correlate with semantic polarity trends on Twitter, empirically linking for the first time how online discussions are influenced by televised media. Second, I will show how “gists,” or causal statements, in social media discussions about pandemic health practices unveil underlying beliefs and attitudes, which, in turn, can forecast broader health trends across the U.S. Finally, by examining the linguistic interactions captured in thousands of police body-worn camera recordings, I demonstrate how the first 45 words spoken by a police officer during a car stop with a Black driver can be quite telling about how the stop will conclude. Persistent challenges in dialogue marked by tensions and biases can have wide-ranging implications for both individuals and society. These studies call for a broader awareness of the influence of our language choices across institutional, media, and online contexts.
Biography

Eugenia Rho is an Assistant Professor of Computer Science at Virginia Tech, where she leads the SAIL (Society + AI & Language) Lab. Her research lies at the intersection of Natural Language Processing (NLP) and Human-Computer Interaction (HCI). Her work aims to advance Computational Social Science (CSS) by using computational linguistics to better understand how AI-mediated systems impact interactions between people and machines.
Nov
6
Mon
Student Seminar – Neha Verma “Exploring Geometric Representational Disparities Between Multilingual and Bilingual Translation Models” @ Hackerman Hall B17
Nov 6 @ 12:00 pm – 1:15 pm

Abstract

Multilingual machine translation has proven immensely useful for both parameter efficiency and overall performance across many language pairs via complete parameter sharing. However, some language pairs in multilingual models can see worse performance than in bilingual models, especially in the one-to-many translation setting. Motivated by these empirical differences, we examine the geometric differences between representations from bilingual models and those from one-to-many multilingual models. Specifically, we measure the isotropy of these representations using intrinsic dimensionality and IsoScore, in order to assess how the representations utilize the dimensions of their underlying vector space. We find that for a given language pair, the multilingual model's decoder representations are consistently less isotropic than comparable bilingual model decoder representations. Additionally, we show that much of this anisotropy in multilingual decoder representations can be attributed to modeling language-specific information, which limits the remaining representational capacity.
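
As a rough illustration of the kind of measurement involved, the sketch below computes a simple eigenvalue-entropy proxy for how evenly a set of representation vectors spreads its variance across dimensions. It is an illustrative stand-in under stated assumptions, not the IsoScore or intrinsic-dimensionality estimators used in the work above.

```python
# Illustrative isotropy proxy (not IsoScore): normalized entropy of the
# covariance eigenvalue spectrum; values near 1.0 mean variance is spread
# evenly across all dimensions, lower values mean it is concentrated in a few.
import numpy as np


def isotropy_proxy(points: np.ndarray) -> float:
    centered = points - points.mean(axis=0, keepdims=True)
    eigvals = np.clip(np.linalg.eigvalsh(np.cov(centered, rowvar=False)), 0.0, None)
    probs = eigvals / eigvals.sum()
    entropy = -(probs * np.log(probs + 1e-12)).sum()
    return float(entropy / np.log(len(probs)))


isotropic = np.random.randn(1000, 64)                 # roughly isotropic cloud
anisotropic = isotropic * np.linspace(0.01, 1.0, 64)  # variance concentrated in fewer dims
print(isotropy_proxy(isotropic), isotropy_proxy(anisotropic))
```
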

Center for Language and Speech Processing