While large language models have advanced the state of the art in natural language processing, these models are trained on large-scale datasets, which may include harmful information. Studies have shown that, as a result, the models exhibit social biases and generate misinformation after training. In this talk, I will discuss my work on analyzing and interpreting the risks of large language models across the areas of fairness, trustworthiness, and safety. I will first describe my research on detecting dialect bias between African American English (AAE) and Standard American English (SAE). The second part investigates the trustworthiness of models through the memorization and subsequent generation of conspiracy theories. I will end my talk with recent work in AI safety regarding text that may lead to physical harm.
Sharon is a 5th-year Ph.D. candidate at the University of California, Santa Barbara, where she is advised by Professor William Wang. Her research interests lie in natural language processing, with a focus on Responsible AI. Sharon’s research spans the subareas of fairness, trustworthiness, and safety, with publications in ACL, EMNLP, WWW, and LREC. She has spent summers interning at AWS, Meta, and Pinterest. Sharon is a 2022 EECS Rising Star and a current recipient of the Amazon Alexa AI Fellowship for Responsible AI.
Large language models (LLMs) have demonstrated incredible power, but they also possess vulnerabilities that can lead to misuse and potential attacks. In this presentation, we will address two fundamental questions regarding the responsible utilization of LLMs: (1) How can we accurately identify AI-generated text? (2) What measures can safeguard the intellectual property of LLMs? We will introduce two recent watermarking techniques designed for text and models, respectively. Our discussion will encompass the theoretical underpinnings that ensure the correctness of watermark detection, along with robustness against evasion attacks. Furthermore, we will showcase empirical evidence validating their effectiveness. These findings establish a solid technical groundwork for policymakers, legal professionals, and generative AI practitioners alike.
Lei Li is an Assistant Professor in the Language Technologies Institute at Carnegie Mellon University. He received his Ph.D. from the School of Computer Science at Carnegie Mellon University. He is a recipient of the ACL 2021 Best Paper Award, the CCF Young Elite Award in 2019, the Wu Wen-tsün AI prize in 2017, and the 2012 ACM SIGKDD dissertation award (runner-up); he was also named a CCF distinguished speaker in 2017 and recognized as a Notable Area Chair of ICLR 2023. Previously, he was a faculty member at UC Santa Barbara. Prior to that, he founded ByteDance AI Lab in 2016 and led its research in NLP, ML, Robotics, and Drug Discovery. He launched ByteDance’s machine translation system VolcTrans and AI writing system Xiaomingbot, serving one billion users.