Abstract: Advanced AI systems are being deployed for increasingly complex tasks. To ensure reliable human oversight of these systems, we need supervision protocols that remain effective despite the increase in task complexity and model[...]
Abstract: The widespread adoption of large language models (LLMs) places a responsibility on the AI research community to rigorously study and understand them. In this talk, I will describe my group’s research on analyzing LLMs’[...]