BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-20716@www.clsp.jhu.edu
DTSTAMP:20240329T144348Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nOver the last few years\, deep neural models have taken over the field of natural language processing (NLP)\, brandishing great improvements on many of its sequence-level tasks. But the end-to-end nature of these models makes it hard to figure out whether the way they represent individual words aligns with how language builds itself from the bottom up\, or how lexical changes in register and domain can affect the untested aspects of such representations.
\nIn this talk\, I will present NYTWIT\, a dataset created to challenge large language models at the lexical level\, tasking them with identification of processes leading to the formation of novel English words\, as well as with segmentation and recovery of the specific subclass of novel blends. I will then present XRayEmb\, a method which alleviates the hardships of processing these novelties by fitting a character-level encoder to the existing models’ subword tokenizers\; and conclude with a discussion of the drawbacks of current tokenizers’ vocabulary creation schemes.
\nBiography
\nYuval Pinter is a Senior Lecturer in the Department of Computer Science at Ben-Gurion University of the Negev\, focusing on natural language processing. Yuval got his PhD at the Georgia Institute of Technology School of Interactive Computing as a Bloomberg Data Science PhD Fellow. Before that\, he worked as a Research Engineer at Yahoo Labs and as a Computational Linguist at Ginger Software\, and obtained an MA in Linguistics and a BSc in CS and Mathematics\, both from Tel Aviv University. Yuval blogs (in Hebrew) about language matters on Dagesh Kal.
DTSTART;TZID=America/New_York:20210910T120000
DTEND;TZID=America/New_York:20210910T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD
SEQUENCE:0
SUMMARY:Yuval Pinter (Ben-Gurion University – Virtual Visit) “Challenging and Adapting NLP Models to Lexical Phenomena”
URL:https://www.clsp.jhu.edu/events/yuval-pinter/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2021\,Pinter\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23586@www.clsp.jhu.edu
DTSTAMP:20240329T144348Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:
DTSTART;TZID=America/New_York:20230410T120000
DTEND;TZID=America/New_York:20230410T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Student Seminar – Ruizhe Huang
URL:https://www.clsp.jhu.edu/events/student-seminar-ruizhe-huang/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,April\,Huang
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23892@www.clsp.jhu.edu
DTSTAMP:20240329T144348Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nThe growing power in computing and AI promises a near-term future of human-machine teamwork. In this talk\, I will present my research group’s efforts in understanding the complex dynamics of human-machine interaction and designing intelligent machines aimed to assist and collaborate with people. I will focus on 1) tools for onboarding machine teammates and authoring machine assistance\, 2) methods for detecting\, and broadly managing\, errors in collaboration\, and 3) building blocks of knowledge needed to enable ad hoc human-machine teamwork. I will also highlight our recent work on designing assistive\, collaborative machines to support older adults aging in place.
\nBiography
\nChien-Ming Huang is the John C. Malone Assistant Professor in the Department of Computer Science at the Johns Hopkins University. His research focuses on designing interactive AI aimed to assist and collaborate with people. He publishes in top-tier venues in HRI\, HCI\, and robotics including Science Robotics\, HRI\, CHI\, and CSCW. His research has received media coverage from MIT Technology Review\, Tech Insider\, and Science Nation. Huang completed his postdoctoral training at Yale University and received his Ph.D. in Computer Science at the University of Wisconsin–Madison. He is a recipient of the NSF CAREER award. https://www.cs.jhu.edu/~cmhuang/
DTSTART;TZID=America/New_York:20230915T120000
DTEND;TZID=America/New_York:20230915T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Chien-Ming Huang (Johns Hopkins University) “Becoming Teammates: Designing Assistive\, Collaborative Machines”
URL:https://www.clsp.jhu.edu/events/chien-ming-huang-johns-hopkins-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Huang\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-24479@www.clsp.jhu.edu
DTSTAMP:20240329T144348Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:Abstract
\nThe speech field is evolving to solve more challenging scenarios\, such as multi-channel recordings with multiple simultaneous talkers. Given the many types of microphone setups out there\, we present the UniX-Encoder. It’s a universal encoder designed for multiple tasks\, and it works with any microphone array\, in both solo and multi-talker environments. Our research enhances previous multichannel speech processing efforts in four key areas: 1) Adaptability: In contrast to traditional models constrained to certain microphone array configurations\, our encoder is universally compatible. 2) Multi-Task Capability: Beyond the single-task focus of previous systems\, UniX-Encoder acts as a robust upstream model\, adeptly extracting features for diverse tasks including ASR and speaker recognition. 3) Self-Supervised Training: The encoder is trained without requiring labeled multi-channel data. 4) End-to-End Integration: In contrast to models that first beamform and then process single channels\, our encoder offers an end-to-end solution\, bypassing explicit beamforming or separation. To validate its effectiveness\, we tested the UniX-Encoder on a synthetic multi-channel dataset from the LibriSpeech corpus. Across tasks like speech recognition and speaker diarization\, our encoder consistently outperformed combinations like the WavLM model with the BeamformIt frontend.
DTSTART;TZID=America/New_York:20240311T200500
DTEND;TZID=America/New_York:20240311T210500
SEQUENCE:0
SUMMARY:Zili Huang (JHU) “Unix-Encoder: A Universal X-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing”
URL:https://www.clsp.jhu.edu/events/zili-huang-jhu-unix-encoder-a-universal-x-channel-speech-encoder-for-ad-hoc-microphone-array-speech-processing/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2024\,Huang\,March
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-24491@www.clsp.jhu.edu
DTSTAMP:20240329T144348Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
DTSTART;TZID=America/New_York:20240401T120000
DTEND;TZID=America/New_York:20240401T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Yuan Gong
URL:https://www.clsp.jhu.edu/events/yuan-gong/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2024\,April\,Gong
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-24507@www.clsp.jhu.edu
DTSTAMP:20240329T144348Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nHistory repeats itself\, sometimes in a bad way. Preventing natural or man-made disasters requires being aware of these patterns and taking pre-emptive action to address and reduce them\, or ideally\, eliminate them. Emerging events\, such as the COVID pandemic and the Ukraine Crisis\, require a time-sensitive comprehensive understanding of the situation to allow for appropriate decision-making and effective action response. Automated generation of situation reports can significantly reduce the time\, effort\, and cost for domain experts when preparing their official human-curated reports. However\, AI research toward this goal has been very limited\, and no successful trials have yet been conducted to automate such report generation and “what-if” disaster forecasting. Pre-existing natural language processing and information retrieval techniques are insufficient to identify\, locate\, and summarize important information\, and lack detailed\, structured\, and strategic awareness. In this talk I will present SmartBook\, a novel framework\, addressing a problem that cannot be solved by large language models alone\, that consumes large volumes of multimodal multilingual news data and produces a structured situation report with multiple hypotheses (claims) summarized and grounded with rich links to factual evidence through multimodal knowledge extraction\, claim detection\, fact checking\, misinformation detection and factual error correction. Furthermore\, SmartBook can also serve as a novel news event simulator\, or an intelligent prophetess. Given “What-if” conditions and dimensions elicited from a domain expert user concerning a disaster scenario\, SmartBook will induce schemas from historical events\, and automatically generate a complex event graph along with a timeline of news articles that describe new simulated events and character-centric stories based on a new Λ-shaped attention mask that can generate text with infinite length.
By effectively simulating disaster scenarios in both event graph and natural language format\, we expect SmartBook will greatly assist humanitarian workers and policymakers to exercise reality checks\, and thus better prevent and respond to future disasters.
\nBio
\nHeng Ji is a professor in the Computer Science Department\, and an affiliated faculty member of the Electrical and Computer Engineering Department and the Coordinated Science Laboratory\, at the University of Illinois Urbana-Champaign. She is an Amazon Scholar. She is the Founding Director of the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE). She received her B.A. and M.A. in Computational Linguistics from Tsinghua University\, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing\, especially on Multimedia Multilingual Information Extraction\, Knowledge-enhanced Large Language Models\, Knowledge-driven Generation and Conversational AI. She was selected as a Young Scientist to attend the 6th World Laureates Association Forum\, and selected to participate in DARPA AI Forward in 2023. She was selected as “Young Scientist” and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. The awards she has received include Women Leaders of Conversational AI (Class of 2023) by Project Voice\, “AI’s 10 to Watch” Award by IEEE Intelligent Systems in 2013\, NSF CAREER award in 2009\, PACLIC2012 Best paper runner-up\, “Best of ICDM2013” paper award\, “Best of SDM2013” paper award\, ACL2018 Best Demo paper nomination\, ACL2020 Best Demo Paper Award\, NAACL2021 Best Demo Paper Award\, Google Research Award in 2009 and 2014\, IBM Watson Faculty Award in 2012 and 2014 and Bosch Research Award in 2014-2018. She was invited to testify to the U.S. House Cybersecurity\, Data Analytics\, & IT Committee as an AI expert in 2023. She was invited by the Secretary of the U.S. Air Force and AFRL to join the Air Force Data Analytics Expert Panel to inform the Air Force Strategy 2030\, and invited to speak at the Federal Information Integrity R&D Interagency Working Group (IIRD IWG) briefing in 2023.
She is the lead of many multi-institution projects and tasks\, including the U.S. ARL projects on information fusion and knowledge networks construction\, DARPA ECOLE MIRACLE team\, DARPA KAIROS RESIN team and DARPA DEFT Tinker Bell team. She coordinated the NIST TAC Knowledge Base Population task from 2010 to 2022. She was an associate editor for IEEE/ACM Transactions on Audio\, Speech\, and Language Processing\, and served as the Program Committee Co-Chair of many conferences including NAACL-HLT2018 and AACL-IJCNLP2022. She was elected secretary of the North American Chapter of the Association for Computational Linguistics (NAACL) for 2020-2023. Her research has been widely supported by U.S. government agencies (DARPA\, NSF\, DoE\, ARL\, IARPA\, AFRL\, DHS) and industry (Apple\, Amazon\, Google\, Facebook\, Bosch\, IBM\, Disney).
DTSTART;TZID=America/New_York:20240405T120000
DTEND;TZID=America/New_York:20240405T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, Maryland 21218
SEQUENCE:0
SUMMARY:Heng Ji (University of Illinois Urbana-Champaign) “SmartBook: an AI Prophetess for Disaster Reporting and Forecasting”
URL:https://www.clsp.jhu.edu/events/heng-ji-university-of-illinois-urbana-champaign-smartbook-an-ai-prophetess-for-disaster-reporting-and-forecasting/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2024\,April\,Ji
END:VEVENT
END:VCALENDAR