Rethinking Test-Time Thinking: From Token-Level Rewards to Robust Generative Agents – Furong Huang (UMD)

When:

March 27, 2026 @ 12:00 pm – 1:15 pm

Where:

Hodson 216

Cost:

Free

Abstract

This talk presents a unified perspective on “test-time thinking” as a catalyst for improving generative AI agents through finer-grained reward modeling, data-centric reasoning, and robust alignment. We begin with GenARM, which introduces an inductive bias for denser, token-level reward modeling that guides generation during decoding, enabling alignment without retraining. Shifting to the data side of reasoning, we explore ThinkLite-VL, a self-improvement framework that leverages MCTS-guided search to select the most informative samples, yielding stronger visual reasoning with fewer labels. Pushing beyond data selection, MORSE-500 programmatically generates targeted, controllable multimodal data to systematically stress-test models’ reasoning abilities. We then critically interrogate a core assumption in inference-time alignment: does thinking more always help? Our findings challenge the naive scaling paradigm, revealing that increased reasoning steps can degrade performance due to rising output variance. Finally, we introduce AegisLLM, which applies test-time thinking to security. Using an agentic, multi-perspective framework, AegisLLM defends against jailbreaks, prompt injections, and unlearning attacks entirely at inference time. Together, these works chart a path toward generative agents that are more capable, data-efficient, introspective, and secure for real-world deployment.

Bio

Furong Huang is an Associate Professor in the Department of Computer Science at the University of Maryland. Her research specializes in trustworthy machine learning, AI security, sequential decision-making, and generative AI. Dr. Huang is dedicated to translating foundational principles into practical solutions, developing machine learning algorithms that are efficient, robust, scalable, ethical, and responsible. Her pioneering contributions have been recognized with numerous accolades, including multiple Best Paper awards, the MIT Technology Review Innovators Under 35 Asia Pacific, the MLconf Industry Impact Research Award, and the NSF CRII Award. She is also the recipient of Faculty Research Awards from Microsoft, Adobe, and J.P. Morgan, and was recently named a Finalist for AI Researcher of the Year (North America) at the Women in AI Awards.

Also Available by Zoom: https://wse.zoom.us/j/96735183473

Center for Language and Speech Processing