Auditing Memorization, Dissecting Mechanisms, and Evaluating Behavior of Large Language Models – Robin Jia (USC)

When:
September 26, 2025 @ 12:00 pm – 1:15 pm
Where:
Hackerman Hall B17
Cost:
Free

Abstract:

The widespread adoption of large language models (LLMs) places a responsibility on the AI research community to rigorously study and understand them. In this talk, I will describe my group’s research on analyzing LLMs’ memorization of pre-training data, their internal mechanisms, and their downstream behavior. First, I will introduce the Hubble project, in which we have pre-trained LLMs (up to 8B parameters) on controlled pre-training corpora to understand when and how they memorize sensitive data related to copyright risks, privacy leakage, and test set contamination; we envision these models as a valuable open-source resource for scientific inquiry into LLM memorization. Next, I will describe my group’s work on understanding how language models work internally, including vignettes about how they perform arithmetic with Fourier features and how they can learn optimization subroutines for in-context learning. Finally, I will highlight a recent collaboration with USC oncologists in which we uncover LLM sycophancy issues that arise when patients ask these models for medical advice.

Bio:

Robin Jia is an Assistant Professor of Computer Science at the University of Southern California. He received his Ph.D. in Computer Science from Stanford University, where he was advised by Percy Liang. He has also spent time as a visiting researcher at Facebook AI Research, working with Luke Zettlemoyer and Douwe Kiela. He is interested broadly in natural language processing and machine learning, with a focus on scientifically understanding NLP models. Robin’s work has received best paper awards at ACL and EMNLP.

Center for Language and Speech Processing