BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-21259@www.clsp.jhu.edu
DTSTAMP:20240328T184312Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nNatural language processing has been revolutionized by neural networks\, which perform impressively well in applications such as machine translation and question answering. Despite their success\, neural networks still have some substantial shortcomings: their internal workings are poorly understood\, and they are notoriously brittle\, failing on example types that are rare in their training data. In this talk\, I will use the unifying thread of hierarchical syntactic structure to discuss approaches for addressing these shortcomings. First\, I will argue for a new evaluation paradigm based on targeted\, hypothesis-driven tests that better illuminate what models have learned\; using this paradigm\, I will show that even state-of-the-art models sometimes fail to recognize the hierarchical structure of language (e.g.\, concluding that “The book on the table is blue” implies “The table is blue”). Second\, I will show how these behavioral failings can be explained through analysis of models’ inductive biases and internal representations\, focusing on the puzzle of how neural networks represent discrete symbolic structure in continuous vector space. I will close by showing how insights from these analyses can be used to make models more robust through approaches based on meta-learning\, structured architectures\, and data augmentation.\nBiography\nTom McCoy is a PhD candidate in the Department of Cognitive Science at Johns Hopkins University. As an undergraduate\, he studied computational linguistics at Yale. His research combines natural language processing\, cognitive science\, and machine learning to study how we can achieve robust generalization in models of language\, as this remains one of the main areas where current AI systems fall short. In particular\, he focuses on inductive biases and representations of linguistic structure\, since these are two of the major components that determine how learners generalize to novel types of input.
DTSTART;TZID=America/New_York:20220131T120000
DTEND;TZID=America/New_York:20220131T131500
LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Tom McCoy (Johns Hopkins University) “Opening the Black Box of Deep Learning: Representations\, Inductive Biases\, and Robustness”
URL:https://www.clsp.jhu.edu/events/tom-mccoy-johns-hopkins-university-opening-the-black-box-of-deep-learning-representations-inductive-biases-and-robustness/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,January\,McCoy
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23302@www.clsp.jhu.edu
DTSTAMP:20240328T184312Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
DTSTART;TZID=America/New_York:20230130T120000
DTEND;TZID=America/New_York:20230130T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Daniel Fried (CMU)
URL:https://www.clsp.jhu.edu/events/daniel-fried-cmu/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Fried\,January
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23586@www.clsp.jhu.edu
DTSTAMP:20240328T184312Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:
DTSTART;TZID=America/New_York:20230410T120000
DTEND;TZID=America/New_York:20230410T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Student Seminar – Ruizhe Huang
URL:https://www.clsp.jhu.edu/events/student-seminar-ruizhe-huang/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,April\,Huang
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23892@www.clsp.jhu.edu
DTSTAMP:20240328T184312Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nThe growing power of computing and AI promises a near-term future of human-machine teamwork. In this talk\, I will present my research group’s efforts in understanding the complex dynamics of human-machine interaction and designing intelligent machines aimed to assist and collaborate with people. I will focus on 1) tools for onboarding machine teammates and authoring machine assistance\, 2) methods for detecting\, and broadly managing\, errors in collaboration\, and 3) building blocks of knowledge needed to enable ad hoc human-machine teamwork. I will also highlight our recent work on designing assistive\, collaborative machines to support older adults aging in place.\nBiography\nChien-Ming Huang is the John C. Malone Assistant Professor in the Department of Computer Science at the Johns Hopkins University. His research focuses on designing interactive AI aimed to assist and collaborate with people. He publishes in top-tier venues in HRI\, HCI\, and robotics\, including Science Robotics\, HRI\, CHI\, and CSCW. His research has received media coverage from MIT Technology Review\, Tech Insider\, and Science Nation. Huang completed his postdoctoral training at Yale University and received his Ph.D. in Computer Science at the University of Wisconsin–Madison. He is a recipient of the NSF CAREER award. https://www.cs.jhu.edu/~cmhuang/
DTSTART;TZID=America/New_York:20230915T120000
DTEND;TZID=America/New_York:20230915T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Chien-Ming Huang (Johns Hopkins University) “Becoming Teammates: Designing Assistive\, Collaborative Machines”
URL:https://www.clsp.jhu.edu/events/chien-ming-huang-johns-hopkins-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Huang\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-24239@www.clsp.jhu.edu
DTSTAMP:20240328T184312Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nNon-invasive neural interfaces have the potential to transform human-computer interaction by providing users with low-friction\, information-rich\, always-available inputs. Reality Labs at Meta is developing such an interface for the control of augmented reality devices based on electromyographic (EMG) signals captured at the wrist. Speech and audio technologies turn out to be especially well suited to unlocking the full potential of these signals and interactions\, and this talk will present several specific problems and the speech and audio approaches that have advanced us toward this ultimate goal of effortless and joyful interfaces. We will provide the necessary neuroscientific background to understand these signals\, describe automatic speech recognition-inspired interfaces for generating text and beamforming-inspired interfaces for identifying individual neurons\, and then explain how they connect with egocentric machine intelligence tasks that might reside on these devices.\nBiography\nMichael I Mandel is a Research Scientist in Reality Labs at Meta. Previously\, he was an Associate Professor of Computer and Information Science at Brooklyn College and the CUNY Graduate Center\, working at the intersection of machine learning\, signal processing\, and psychoacoustics. He earned his BSc in Computer Science from the Massachusetts Institute of Technology and his MS and PhD with distinction in Electrical Engineering from Columbia University as a Fu Foundation Presidential Scholar. He was an FQRNT Postdoctoral Research Fellow in the Machine Learning laboratory (LISA/MILA) at the Université de Montréal\, an Algorithm Developer at Audience Inc.\, and a Research Scientist in Computer Science and Engineering at the Ohio State University. His work has been supported by the National Science Foundation\, including via a CAREER award\, the Alfred P. Sloan Foundation\, and Google\, Inc.
DTSTART;TZID=America/New_York:20240129T120000
DTEND;TZID=America/New_York:20240129T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Michael I Mandel (Meta) “Speech and Audio Processing in Non-Invasive Brain-Computer Interfaces at Meta”
URL:https://www.clsp.jhu.edu/events/michael-i-mandel-cuny/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2024\,January\,Mandel
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-24479@www.clsp.jhu.edu
DTSTAMP:20240328T184312Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:Abstract\nThe speech field is evolving to solve more challenging scenarios\, such as multi-channel recordings with multiple simultaneous talkers. Given the many types of microphone setups in use\, we present the UniX-Encoder: a universal encoder designed for multiple tasks that works with any microphone array\, in both single- and multi-talker environments. Our research enhances previous multi-channel speech processing efforts in four key areas: 1) Adaptability: in contrast to traditional models constrained to certain microphone array configurations\, our encoder is universally compatible. 2) Multi-task capability: beyond the single-task focus of previous systems\, the UniX-Encoder acts as a robust upstream model\, adeptly extracting features for diverse tasks including ASR and speaker recognition. 3) Self-supervised training: the encoder is trained without requiring labeled multi-channel data. 4) End-to-end integration: in contrast to models that first beamform and then process single channels\, our encoder offers an end-to-end solution\, bypassing explicit beamforming or separation. To validate its effectiveness\, we tested the UniX-Encoder on a synthetic multi-channel dataset derived from the LibriSpeech corpus. Across tasks like speech recognition and speaker diarization\, our encoder consistently outperformed combinations like the WavLM model with the BeamformIt frontend.
DTSTART;TZID=America/New_York:20240311T200500
DTEND;TZID=America/New_York:20240311T210500
SEQUENCE:0
SUMMARY:Zili Huang (JHU) “UniX-Encoder: A Universal X-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing”
URL:https://www.clsp.jhu.edu/events/zili-huang-jhu-unix-encoder-a-universal-x-channel-speech-encoder-for-ad-hoc-microphone-array-speech-processing/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2024\,Huang\,March
END:VEVENT
END:VCALENDAR