Topic 2: AI Welfare Is Bullshit – Jay Huang – April 3, 2026
Abstract
Recent proposals urge AI labs to prepare for “AI welfare” under uncertainty about whether AI systems have morally relevant inner states. We do not argue for or against the possibility of AI welfare. We argue that current welfare indicators lack a credible path to truth-tracking and therefore should not be used as binding gates for oversight, release, or accountability. AI welfare faces two structural problems absent from other evaluation targets. First, the system and its welfare indicators are co-engineered: development decisions that shape behavior also determine welfare scores. Second, no external validation channel exists: no deployment failure or independent test can reveal whether a welfare metric measures anything real. Welfare scores can therefore be manufactured or suppressed by design with no reality check available. We trace how welfare framings create gates on routine ML decisions and rhetorical shields against accountability, and propose that restrictions on AI systems be justified by externally verifiable harms rather than welfare scores.
Bio
Jen-Tse (Jay) Huang is a postdoctoral researcher at the Center for Language and Speech Processing (CLSP) at Johns Hopkins University, working with Mark Dredze. He received his Ph.D. in Computer Science and Engineering from the Chinese University of Hong Kong and his B.Sc. from Peking University. His research explores the alignment between humans and AI, drawing on the psychological, cognitive, and behavioral sciences. His work has been published in top-tier AI venues, including one oral presentation at ICLR 2024. He actively serves as an area chair for conferences including NeurIPS and ACL Rolling Review.
Also Available by Zoom: https://wse.zoom.us/j/96735183473