While large language models have advanced the state of the art in natural language processing, these models are trained on large-scale datasets, which may include harmful information. Studies have shown that as a result, the models exhibit social biases and generate misinformation after training. In this talk, I will discuss my work on analyzing and interpreting the risks of large language models across the areas of fairness, trustworthiness, and safety. I will first describe my research in the detection of dialect bias between African American English (AAE) and Standard American English (SAE). The second part investigates the trustworthiness of models through the memorization and subsequent generation of conspiracy theories. I will end my talk with recent work in AI safety regarding text that may lead to physical harm.
Sharon is a 5th-year Ph.D. candidate at the University of California, Santa Barbara, where she is advised by Professor William Wang. Her research interests lie in natural language processing, with a focus on Responsible AI. Sharon’s research spans the subareas of fairness, trustworthiness, and safety, with publications in ACL, EMNLP, WWW, and LREC. She has spent summers interning at AWS, Meta, and Pinterest. Sharon is a 2022 EECS Rising Star and a current recipient of the Amazon Alexa AI Fellowship for Responsible AI.
Automated analysis of student writing has the potential to provide alternatives to selected-response questions such as multiple choice, and to enable teachers and instructors to assess students’ reasoning skills based on their long-form writing. Further, automated support to assess both short answers and long passages could provide students with a smoother trajectory towards mastery of written communication. Our methods focus on the specific ideas students express to support formative assessment through different kinds of feedback, which aims to scaffold their abilities to reason and communicate. In this talk I review our work in the PSU NLP lab on methods for automated assessment of different forms of student writing, from younger and older students. I will briefly illustrate highly curated datasets created in collaboration with researchers in STEM education, results from deployment of an older content analysis tool on middle school physics essays, and very preliminary results on assessment of college students’ physics lab reports. I will also present our current work on short answer assessment using a novel recurrent relation network that incorporates contrastive learning.
Becky Passonneau has been a Professor in the Department of Computer Science and Engineering at Penn State University since 2016, when she joined as the first NLP researcher. Since that time, the NLP faculty has grown to include Rui Zhang and Wenpeng Yin. Becky’s research in natural language processing addresses computational pragmatics, meaning the investigation of language as a system of interactive behavior that serves a wide range of purposes. She received her PhD in Linguistics from the University of Chicago in 1985, and worked at several academic and industry research labs before joining Penn State. Her work is reported in over 140 publications in journals and refereed conference proceedings, and has been funded through 27 sponsored projects from 16 sources, including government agencies, corporate sponsors, corporate gifts, and foundations.