BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-21277@www.clsp.jhu.edu DTSTAMP:20240328T185933Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nAs humans\, our understanding of language is grounded in a rich mental model about “how the world works” – that we learn through perception and interaction. We use this understanding to reason beyond what we literally observe or read\, imagining how situations might unfold in the world. Machines today struggle at this kind of reasoning\, which limits how they can communicate with humans.
In my talk\, I will discuss three lines of work to bridge this gap between machines and humans. I will first discuss how we might measure grounded understanding. I will introduce a suite of approaches for constructing benchmarks\, using machines in the loop to filter out spurious biases. Next\, I will introduce PIGLeT: a model that learns physical commonsense understanding by interacting with the world through simulation\, using this knowledge to ground language. From an English-language description of an event\, PIGLeT can anticipate how the world state might change – outperforming text-only models that are orders of magnitude larger. Finally\, I will introduce MERLOT\, which learns about situations in the world by watching millions of YouTube videos with transcribed speech. Through training objectives inspired by the developmental psychology idea of multimodal reentry\, MERLOT learns to fuse language\, vision\, and sound together into powerful representations. Together\, these directions suggest a path forward for building machines that learn language rooted in the world.\n
Biography
\nRowan Zellers is a final year PhD candidate at the University of Washington in Computer Science & Engineering\, advised by Yejin Choi and Ali Farhadi. His research focuses on enabling machines to understand language\, vision\, sound\, and the world beyond these modalities. He has been recognized through an NSF Graduate Fellowship and a NeurIPS 2021 outstanding paper award. His work has appeared in several media outlets\, including Wired\, the Washington Post\, and the New York Times. In the past\, he graduated from Harvey Mudd College with a B.S. in Computer Science & Mathematics\, and has interned at the Allen Institute for AI.
DTSTART;TZID=America/New_York:20220214T120000 DTEND;TZID=America/New_York:20220214T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Rowan Zellers (University of Washington) “Grounding Language by Seeing\, Hearing\, and Interacting” URL:https://www.clsp.jhu.edu/events/rowan-zellers-university-of-washington-grounding-language-by-seeing-hearing-and-interacting/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Zellers END:VEVENT BEGIN:VEVENT UID:ai1ec-24465@www.clsp.jhu.edu DTSTAMP:20240328T185933Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nLarge Language Models (LLMs) have demonstrated remarkable capabilities across various domains. However\, it is still very challenging to build highly-reliable applications with LLMs that support specialized use cases. LLMs trained on web data often excel at capturing general language patterns\, but they could struggle to support specialized domains and personalized user needs. Moreover\, LLMs can produce errors that are deceptively plausible\, making them potentially dangerous for high-trust scenarios. In this talk\, I will discuss some of our recent efforts in addressing these challenges with data-efficient tuning methods and a novel factuality evaluation framework. Specifically\, my talk will focus on building multilingual applications\, one crucial use case often characterized by limited tuning and evaluation data.
\nBio
Xinyi (Cindy) Wang is a research scientist at Google DeepMind working on Large Language Models (LLMs) and their application to generative question-answering. She has worked on multilingual instruction-tuning for Gemini and multilingual generative models used in Google search. Before Google DeepMind\, Cindy Wang obtained her PhD degree in Language Technologies at Carnegie Mellon University. During her PhD\, she mainly worked on developing data-efficient natural language processing (NLP) systems. She has made several contributions in data selection\, data representation\, and model adaptation for multilingual NLP.
DTSTART;TZID=America/New_York:20240308T120000 DTEND;TZID=America/New_York:20240308T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Cindy Wang (Google DeepMind) “Building Data-Efficient and Reliable Applications with Large Language Models” URL:https://www.clsp.jhu.edu/events/cindy-wang-google-deepmind-building-data-efficient-and-reliable-applications-with-large-language-models/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,March\,Wang END:VEVENT END:VCALENDAR