BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-21259@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nNatural language processing has been revolutionized by neural networks\, which perform impressively well in applications such as machine translation and question answering. Despite their success\, neural networks still have some substantial shortcomings: Their internal workings are poorly understood\, and they are notoriously brittle\, failing on example types that are rare in their training data. In this talk\, I will use the unifying thread of hierarchical syntactic structure to discuss approaches for addressing these shortcomings. First\, I will argue for a new evaluation paradigm based on targeted\, hypothesis-driven tests that better illuminate what models have learned\; using this paradigm\, I will show that even state-of-the-art models sometimes fail to recognize the hierarchical structure of language (e.g.\, to conclude that “The book on the table is blue” implies “The table is blue.”) Second\, I will show how these behavioral failings can be explained through analysis of models’ inductive biases and internal representations\, focusing on the puzzle of how neural networks represent discrete symbolic structure in continuous vector space. I will close by showing how insights from these analyses can be used to make models more robust through approaches based on meta-learning\, structured architectures\, and data augmentation.
\nBiography
\nTom McCoy is a PhD candidate in the Department of Cognitive Science at Johns Hopkins University. As an undergraduate\, he studied computational linguistics at Yale. His research combines natural language processing\, cognitive science\, and machine learning to study how we can achieve robust generalization in models of language\, as this remains one of the main areas where current AI systems fall short. In particular\, he focuses on inductive biases and representations of linguistic structure\, since these are two of the major components that determine how learners generalize to novel types of input.
DTSTART;TZID=America/New_York:20220131T120000 DTEND;TZID=America/New_York:20220131T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Tom McCoy (Johns Hopkins University) “Opening the Black Box of Deep Learning: Representations\, Inductive Biases\, and Robustness” URL:https://www.clsp.jhu.edu/events/tom-mccoy-johns-hopkins-university-opening-the-black-box-of-deep-learning-representations-inductive-biases-and-robustness/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,January\,McCoy END:VEVENT BEGIN:VEVENT UID:ai1ec-21267@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nIn this talk\, I present a multipronged strategy for zero-shot cross-lingual Information Extraction\, that is\, the construction of an IE model for some target language\, given existing annotations exclusively in some other language. This work is part of the JHU team’s effort under the IARPA BETTER program. I explore data augmentation techniques including data projection and self-training\, and how different pretrained encoders impact them. We find through extensive experiments and extension of techniques that a combination of approaches\, both new and old\, leads to better performance than any one cross-lingual strategy in particular.
\nBiography
\nAbstract
\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have questioned the robustness of longitudinal analyses based on statistical methods due to issues of temporal bias and semantic shift. To what extent are changes in semantics over time affecting the reliability of longitudinal analyses? We examine this question through a case study: understanding shifts in mental health during the course of the COVID-19 pandemic. We demonstrate that a recently-introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and improve predictive generalization over time. Ultimately\, we find that these analyses are critical to producing accurate longitudinal studies of social media.
DTSTART;TZID=America/New_York:20220207T120000 DTEND;TZID=America/New_York:20220207T131500 LOCATION:In Person or Virtual Option @ https://wse.zoom.us/j/96735183473 @ 234 Ames Hall\, 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Keith Harrigian “The Problem of Semantic Shift in Longitudinal Monitoring of Social Media: A Case Study on Mental Health during the COVID-19 Pandemic” URL:https://www.clsp.jhu.edu/events/student-seminar-keith-harrigian-the-problem-of-semantic-shift-in-longitudinal-monitoring-of-social-media-a-case-study-on-mental-health-during-the-covid-19-pandemic/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Harrigian END:VEVENT BEGIN:VEVENT UID:ai1ec-21275@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\nAutomatic discovery of phone or word-like units is one of the core objectives in zero-resource speech processing. Recent attempts employ contrastive predictive coding (CPC)\, where the model learns representations by predicting the next frame given past context. However\, CPC only looks at the audio signal’s structure at the frame level. The speech structure exists beyond the frame level\, i.e.\, at the phone level or even higher. We propose a segmental contrastive predictive coding (SCPC) framework to learn from the signal structure at both the frame and phone levels.\nSCPC is a hierarchical model with three stages trained in an end-to-end manner. In the first stage\, the model predicts future feature frames and extracts frame-level representation from the raw waveform. In the second stage\, a differentiable boundary detector finds variable-length segments. In the last stage\, the model predicts future segments to learn segment representations. Experiments show that our model outperforms existing phone and word segmentation methods on the TIMIT and Buckeye datasets.
Abstract
\nAs humans\, our understanding of language is grounded in a rich mental model about “how the world works” – that we learn through perception and interaction. We use this understanding to reason beyond what we literally observe or read\, imagining how situations might unfold in the world. Machines today struggle at this kind of reasoning\, which limits how they can communicate with humans.\nIn my talk\, I will discuss three lines of work to bridge this gap between machines and humans. I will first discuss how we might measure grounded understanding. I will introduce a suite of approaches for constructing benchmarks\, using machines in the loop to filter out spurious biases. Next\, I will introduce PIGLeT: a model that learns physical commonsense understanding by interacting with the world through simulation\, using this knowledge to ground language. From an English-language description of an event\, PIGLeT can anticipate how the world state might change – outperforming text-only models that are orders of magnitude larger. Finally\, I will introduce MERLOT\, which learns about situations in the world by watching millions of YouTube videos with transcribed speech. Through training objectives inspired by the developmental psychology idea of multimodal reentry\, MERLOT learns to fuse language\, vision\, and sound together into powerful representations. Together\, these directions suggest a path forward for building machines that learn language rooted in the world.\n
Biography
\nRowan Zellers is a final year PhD candidate at the University of Washington in Computer Science & Engineering\, advised by Yejin Choi and Ali Farhadi. His research focuses on enabling machines to understand language\, vision\, sound\, and the world beyond these modalities. He has been recognized through an NSF Graduate Fellowship and a NeurIPS 2021 outstanding paper award. His work has appeared in several media outlets\, including Wired\, the Washington Post\, and the New York Times. In the past\, he graduated from Harvey Mudd College with a B.S. in Computer Science & Mathematics\, and has interned at the Allen Institute for AI.
DTSTART;TZID=America/New_York:20220214T120000 DTEND;TZID=America/New_York:20220214T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Rowan Zellers (University of Washington) “Grounding Language by Seeing\, Hearing\, and Interacting” URL:https://www.clsp.jhu.edu/events/rowan-zellers-university-of-washington-grounding-language-by-seeing-hearing-and-interacting/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Zellers END:VEVENT BEGIN:VEVENT UID:ai1ec-21280@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nAs AI-driven language interfaces (such as chat-bots) become more integrated into our lives\, they need to become more versatile and reliable in their communication with human users. How can we make progress toward building more “general” models that are capable of understanding a broader spectrum of language commands\, given practical constraints such as the limited availability of labeled data?
\nIn this talk\, I will describe my research toward addressing this question along two dimensions of generality. First\, I will discuss progress in “breadth” — models that address a wider variety of tasks and abilities\, drawing inspiration from existing statistical learning techniques such as multi-task learning. In particular\, I will showcase a system that works well on several QA benchmarks\, resulting in state-of-the-art results on 10 benchmarks. Furthermore\, I will show its extension to tasks beyond QA (such as text generation or classification) that can be “defined” via natural language. In the second part\, I will focus on progress in “depth” — models that can handle complex inputs such as compositional questions. I will introduce Text Modular Networks\, a general framework that casts problem-solving as natural language communication among simpler “modules.” Applying this framework to compositional questions by leveraging discrete optimization and existing non-compositional closed-box QA models results in a model with strong empirical performance on multiple complex QA benchmarks while providing human-readable reasoning.
\nI will conclude with future research directions toward broader NLP systems by addressing the limitations of the presented ideas and other missing elements needed to move toward more general-purpose interactive language understanding systems.
\nBiography
\nDaniel Khashabi is a postdoctoral researcher at the Allen Institute for Artificial Intelligence (AI2)\, Seattle. Previously\, he completed his Ph.D. in Computer and Information Sciences at the University of Pennsylvania in 2019. His interests lie at the intersection of artificial intelligence and natural language processing\, with a vision toward more general systems through unified algorithms and theories.
DTSTART;TZID=America/New_York:20220218T120000 DTEND;TZID=America/New_York:20220218T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Daniel Khashabi (Allen Institute for Artificial Intelligence) “The Quest Toward Generality in Natural Language Understanding” URL:https://www.clsp.jhu.edu/events/daniel-khashabi-allen-institute-for-artificial-intelligence/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Khashabi END:VEVENT BEGIN:VEVENT UID:ai1ec-21487@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nSince it is increasingly harder to opt out from interacting with AI technology\, people demand that AI is capable of maintaining contracts such that it supports agency and oversight of people who are required to use it or who are affected by it. To help those people create a mental model about how to interact with AI systems\, I extend the underlying models to self-explain—predict the label/answer and explain this prediction. In this talk\, I will present how to generate (1) free-text explanations given in plain English that immediately tell users the gist of the reasoning\, and (2) contrastive explanations that help users understand how they could change the text to get another label.
\nBiography
\nAna Marasović is a postdoctoral researcher at the Allen Institute for AI (AI2) and the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research interests broadly lie in the fields of natural language processing\, explainable AI\, and vision-and-language learning. Her projects are motivated by a unified goal: improve interaction and control of NLP systems to help people make these systems do what they want with the confidence that they’re getting exactly what they need. Prior to joining AI2\, Ana obtained her PhD from Heidelberg University.
\nHow to pronounce my name: the first name is Ana like in Spanish\, i.e.\, with a long “a” like in “water”\; regarding the last name: “mara” as in actress Mara Wilson + “so” + “veetch”.
DTSTART;TZID=America/New_York:20220228T120000 DTEND;TZID=America/New_York:20220228T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Ana Marasović (Allen Institute for AI & University of Washington) “Self-Explaining for Intuitive Interaction with AI” URL:https://www.clsp.jhu.edu/events/ana-marasovic-allen-institute-for-ai-university-of-washington-self-explaining-for-intuitive-interaction-with-ai/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Marasovic END:VEVENT BEGIN:VEVENT UID:ai1ec-21494@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\nAdversarial attacks deceive neural network systems by adding carefully crafted perturbations to benign signals. Being almost imperceptible to humans\, these attacks pose a severe security threat to state-of-the-art speech and speaker recognition systems\, making it vital to propose countermeasures against them. In this talk\, we focus on 1) classification of a given adversarial attack into attack algorithm type\, threat model type\, and signal-to-adversarial-noise ratio\, and 2) developing a novel speech denoising solution to further improve the classification performance.
\nOur proposed approach uses an x-vector network as a signature extractor to get embeddings\, which we call signatures. These signatures contain information about the attack and can help classify different attack algorithms\, threat models\, and signal-to-adversarial-noise ratios. We demonstrate the transferability of such signatures to other tasks. In particular\, a signature extractor trained to classify attacks against speaker identification can also be used to classify attacks against speaker verification and speech recognition. We also show that signatures can be used to detect unknown attacks\, i.e.\, attacks not included during training. Lastly\, we propose to make the signature extractor’s job easier by removing the clean signal from the adversarial example (which consists of clean signal + perturbation). We train our signature extractor using adversarial perturbations. At inference time\, we use a time-domain denoiser to obtain the adversarial perturbation from adversarial examples. Using our improved approach\, we show that common attacks in the literature (Fast Gradient Sign Method (FGSM)\, Projected Gradient Descent (PGD)\, Carlini-Wagner (CW)) can be classified with accuracy as high as 96%. We also detect unknown attacks with an equal error rate (EER) of about 9%\, which is very promising.
DTSTART;TZID=America/New_York:20220304T120000 DTEND;TZID=America/New_York:20220304T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Sonal Joshi “Classify and Detect Adversarial Attacks Against Speaker and Speech Recognition Systems” URL:https://www.clsp.jhu.edu/events/student-seminar-sonal-joshi/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,Joshi\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-21615@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\nDTSTART;TZID=America/New_York:20220311T120000 DTEND;TZID=America/New_York:20220311T131500 LOCATION:Virtual Seminar SEQUENCE:0 SUMMARY:Student Seminar – Anton Belyy “Systems for Human-AI Cooperation on Collecting Semantic Annotations” URL:https://www.clsp.jhu.edu/events/student-seminar-anton-belyy-systems-for-human-ai-cooperation-on-collecting-semantic-annotations/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,Belyy\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-21621@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nSystems that support expressive\, situated natural language interactions are essential for expanding access to complex computing systems\, such as robots and databases\, to non-experts. Reasoning and learning in such natural language interactions is a challenging open problem. For example\, resolving sentence meaning requires reasoning not only about word meaning\, but also about the interaction context\, including the history of the interaction and the situated environment. In addition\, the sequential dynamics that arise between user and system in and across interactions make learning from static data\, i.e.\, supervised data\, both challenging and ineffective. However\, these same interaction dynamics result in ample opportunities for learning from implicit and explicit feedback that arises naturally in the interaction. This lays the foundation for systems that continually learn\, improve\, and adapt their language use through interaction\, without additional annotation effort. In this talk\, I will focus on these challenges and opportunities. First\, I will describe our work on modeling dependencies between language meaning and interaction context when mapping natural language in interaction to executable code. In the second part of the talk\, I will describe our work on language understanding and generation in collaborative interactions\, focusing on continual learning from explicit and implicit user feedback.
\nBiography
\nAlane Suhr is a PhD Candidate in the Department of Computer Science at Cornell University\, advised by Yoav Artzi. Her research spans natural language processing\, machine learning\, and computer vision\, with a focus on building systems that participate and continually learn in situated natural language interactions with human users. Alane’s work has been recognized by paper awards at ACL and NAACL\, and has been supported by fellowships and grants\, including an NSF Graduate Research Fellowship\, a Facebook PhD Fellowship\, and research awards from AI2\, ParlAI\, and AWS. Alane has also co-organized multiple workshops and tutorials appearing at NeurIPS\, EMNLP\, NAACL\, and ACL. Previously\, Alane received a BS in Computer Science and Engineering as an Eminence Fellow at the Ohio State University.
DTSTART;TZID=America/New_York:20220314T120000 DTEND;TZID=America/New_York:20220314T131500 LOCATION:Virtual Seminar SEQUENCE:0 SUMMARY:Alane Suhr (Cornell University) “Reasoning and Learning in Interactive Natural Language Systems” URL:https://www.clsp.jhu.edu/events/alane-suhr-cornell-university-reasoning-and-learning-in-interactive-natural-language-systems/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,March\,Suhr END:VEVENT BEGIN:VEVENT UID:ai1ec-21616@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have shown that the absence of appropriate tuning\, specifically in the presence of semantic shift\, can hinder robustness of the underlying methods. However\, little is known about the practical effect this sensitivity may have on downstream longitudinal analyses. We explore this gap in the literature through a timely case study: understanding shifts in depression during the course of the COVID-19 pandemic. We find that inclusion of only a small number of semantically-unstable features can promote significant changes in longitudinal estimates of our target outcome. At the same time\, we demonstrate that a recently-introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and\, in turn\, improve predictive generalization.
DTSTART;TZID=America/New_York:20220318T120000 DTEND;TZID=America/New_York:20220318T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Keith Harrigian “The Problem of Semantic Shift in Longitudinal Monitoring of Social Media” URL:https://www.clsp.jhu.edu/events/student-seminar-keith-harrigian-the-problem-of-semantic-shift-in-longitudinal-monitoring-of-social-media/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,Harrigian\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-21497@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nWhile the “deep learning tsunami” continues to define the state of the art in speech and language processing\, finite-state transducer grammars developed by linguists and engineers are still widely used in industrial\, highly-multilingual settings\, particularly for symbolic\, “front-end” speech applications. In this talk\, I will first briefly review the current state of the OpenFst and OpenGrm finite-state transducer libraries. I will then review two “late-breaking” algorithms found in these libraries. The first is a heuristic but highly-effective general-purpose optimization routine for weighted transducers. The second is an algorithm for computing the single shortest string of non-deterministic weighted acceptors which lack certain properties required by classic shortest-path algorithms. I will then illustrate how the OpenGrm tools can be used to induce a finite-state string-to-string transduction model known as a pair n-gram model. This model has been applied to grapheme-to-phoneme conversion\, loanword detection\, abbreviation expansion\, and back-transliteration\, among other tasks.
\nBiography
\nKyle Gorman is an assistant professor of linguistics at the Graduate Center\, City University of New York\, and director of the master’s program in computational linguistics\; he is also a software engineer in the speech and language algorithms group at Google. With Richard Sproat\, he is the coauthor of Finite-State Text Processing (Morgan & Claypool\, 2021) and the creator of Pynini\, a finite-state text processing library for Python. He has also published on statistical methods for comparing computational models\, text normalization\, grapheme-to-phoneme conversion\, and morphological analysis\, as well as many topics in linguistic theory.
DTSTART;TZID=America/New_York:20220401T120000 DTEND;TZID=America/New_York:20220401T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Kyle Gorman (City University of New York) “Weighted Finite-State Transducers: The Later Years” URL:https://www.clsp.jhu.edu/events/kyle-gorman-city-university-of-new-york-weighted-finite-state-transducers-the-later-years/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,Gorman\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-22374@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nIn recent years\, the field of Natural Language Processing has seen a profusion of tasks\, datasets\, and systems that facilitate reasoning about real-world situations through language (e.g.\, RTE\, MNLI\, COMET). Such systems might\, for example\, be trained to consider a situation where “somebody dropped a glass on the floor\,” and conclude it is likely that “the glass shattered” as a result. In this talk\, I will discuss three pieces of work that revisit assumptions made by or about these systems. In the first work\, I develop a Defeasible Inference task\, which enables a system to recognize when a prior assumption it has made may no longer be true in light of new evidence it receives. The second work I will discuss revisits partial-input baselines\, which have highlighted issues of spurious correlations in natural language reasoning datasets and led to unfavorable assumptions about models’ reasoning abilities. In particular\, I will discuss experiments that show models may still learn to reason in the presence of spurious dataset artifacts. Finally\, I will touch on work analyzing harmful assumptions made by reasoning models in the form of social stereotypes\, particularly in the case of free-form generative reasoning models.
\nBiography
\nRachel Rudinger is an Assistant Professor in the Department of Computer Science at the University of Maryland\, College Park. She holds joint appointments in the Department of Linguistics and the Institute for Advanced Computer Studies (UMIACS). In 2019\, Rachel completed her Ph.D. in Computer Science at Johns Hopkins University in the Center for Language and Speech Processing. From 2019-2020\, she was a Young Investigator at the Allen Institute for AI in Seattle\, and a visiting researcher at the University of Washington. Her research interests include computational semantics\, common-sense reasoning\, and issues of social bias and fairness in NLP.
DTSTART;TZID=America/New_York:20220916T120000 DTEND;TZID=America/New_York:20220916T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Rachel Rudinger (University of Maryland\, College Park) “Not So Fast!: Revisiting Assumptions in (and about) Natural Language Reasoning” URL:https://www.clsp.jhu.edu/events/rachel-rudinger-university-of-maryland-college-park-not-so-fast-revisiting-assumptions-in-and-about-natural-language-reasoning/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,Rudinger\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-22375@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nI will present our work on data augmentation using style transfer as a way to improve domain adaptation in sequence labeling tasks. The target domain is social media data\, and the task is named entity recognition (NER). The premise is that we can transform the labeled out-of-domain data into something that stylistically is more closely related to the target data. Then we can train a model on a combination of the generated data and the smaller amount of in-domain data to improve NER prediction performance. I will show recent empirical results on these efforts.
\nIf time allows\, I will also give an overview of other research projects I’m currently leading in the RiTUAL (Research in Text Understanding and Analysis of Language) lab. The common thread among all these research problems is the scarcity of labeled data.
\nBiography
\nThamar Solorio is a Professor of Computer Science at the University of Houston (UH). She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica\, Óptica y Electrónica\, in Puebla\, Mexico. Her research interests include information extraction from social media data\, enabling technology for code-switched data\, stylistic modeling of text\, and more recently multimodal approaches for online content understanding. She is the director and founder of the RiTUAL Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution\, and recipient of the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is currently serving a second term as an elected board member of the North American Chapter of the Association for Computational Linguistics and was PC co-chair for NAACL 2019. She recently joined the team of Editors in Chief for the ACL Rolling Review (ARR) system. Her research is currently funded by the NSF and by Adobe. DTSTART;TZID=America/New_York:20220923T120000 DTEND;TZID=America/New_York:20220923T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Thamar Solorio (University of Houston) “Style Transfer for Data Augmentation in Sequence Labeling Tasks” URL:https://www.clsp.jhu.edu/events/thamar-solorio-university-of-houston-style-transfer-for-data-augmentation-in-sequence-labeling-tasks/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,September\,Solorio END:VEVENT BEGIN:VEVENT UID:ai1ec-22380@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nThe availability of large multilingual pre-trained language models has opened up exciting pathways for developing NLP technologies for languages with scarce resources. In this talk\, I will advocate for the need to go beyond the most common languages in multilingual evaluation\, and discuss the challenges of handling new\, unseen-during-training languages and varieties. I will also share some of my experiences with working with indigenous and other endangered language communities and activists.
\nBiography
\nAntonios Anastasopoulos is an Assistant Professor in Computer Science at George Mason University. In 2019\, Antonis received his PhD in Computer Science from the University of Notre Dame and then worked as a postdoctoral researcher at the Language Technologies Institute at Carnegie Mellon University. His research interests revolve around computational linguistics and natural language processing with a focus on low-resource settings\, endangered languages\, and cross-lingual learning.
\nDTSTART;TZID=America/New_York:20220930T120000 DTEND;TZID=America/New_York:20220930T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Antonios Anastasopoulos (George Mason University) “NLP Beyond the Top-100 Languages” URL:https://www.clsp.jhu.edu/events/antonis-anastasopoulos-george-mason-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,Anastasopoulos\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-22423@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20221007T120000 DTEND;TZID=America/New_York:20221007T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Ariya Rastrow (Amazon) URL:https://www.clsp.jhu.edu/events/ariya-rastrow-amazon-2/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,October\,Rastrow END:VEVENT BEGIN:VEVENT UID:ai1ec-22394@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nModel robustness and spurious correlations have received increasing attention in the NLP community\, both in methods and evaluation. The term “spurious correlation” is overloaded\, though\, and can refer to any undesirable shortcuts learned by the model\, as judged by domain experts.
\nWhen designing mitigation algorithms\, we often (implicitly) assume that a spurious feature is irrelevant for prediction. However\, many features in NLP (e.g.\, word overlap and negation) are not spurious in the sense that the background is spurious for classifying objects in an image. On the contrary\, they carry important information that humans need to make predictions. In this talk\, we argue that it is more productive to characterize features in terms of their necessity and sufficiency for prediction. We then discuss the implications of this categorization for representation\, learning\, and evaluation.
\nBiography
\nHe He is an Assistant Professor in the Department of Computer Science and the Center for Data Science at New York University. She obtained her PhD in Computer Science at the University of Maryland\, College Park. Before joining NYU\, she spent a year at AWS AI and was a post-doc at Stanford University before that. She is interested in building robust and trustworthy NLP systems in human-centered settings. Her recent research focus includes robust language understanding\, collaborative text generation\, and understanding capabilities and issues of large language models.
\n DTSTART;TZID=America/New_York:20221014T120000 DTEND;TZID=America/New_York:20221014T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:He He (New York University) “What We Talk about When We Talk about Spurious Correlations in NLP” URL:https://www.clsp.jhu.edu/events/he-he-new-york-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,He\,October END:VEVENT BEGIN:VEVENT UID:ai1ec-22395@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nModern learning architectures for natural language processing have been very successful at incorporating a huge amount of text into their parameters. However\, by and large\, such models store and use knowledge in distributed and decentralized ways. This proves unreliable and makes the models ill-suited for knowledge-intensive tasks that require reasoning over factual information in linguistic expressions. In this talk\, I will give a few examples of exploring alternative architectures to tackle those challenges. In particular\, we can improve the performance of such (language) models by representing\, storing\, and accessing knowledge in a dedicated memory component.
\nThis talk is based on several joint works with Yury Zemlyanskiy (Google Research)\, Michiel de Jong (USC and Google Research)\, William Cohen (Google Research and CMU)\, and our other collaborators in Google Research.
\nBiography
\nFei is a research scientist at Google Research. Before that\, he was a Professor of Computer Science at the University of Southern California. His primary research interests are machine learning and its application to various AI problems: speech and language processing\, computer vision\, robotics\, and recently weather forecasting and climate modeling. He has a PhD (2007) in Computer and Information Science from the University of Pennsylvania and a B.Sc. and M.Sc. in Biomedical Engineering from Southeast University (Nanjing\, China).
DTSTART;TZID=America/New_York:20221024T120000 DTEND;TZID=America/New_York:20221024T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Fei Sha (University of Southern California) “Extracting Information from Text into Memory for Knowledge-Intensive Tasks” URL:https://www.clsp.jhu.edu/events/fei-sha-university-of-southern-california/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,October\,Sha END:VEVENT BEGIN:VEVENT UID:ai1ec-22403@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nVoice conversion (VC) is a significant aspect of artificial intelligence. It is the study of how to convert one’s voice to sound like that of another without changing the linguistic content. Voice conversion belongs to the general technical field of speech synthesis\, which converts text to speech or changes the properties of speech\, for example\, voice identity\, emotion\, and accents. Voice conversion involves multiple speech processing techniques\, such as speech analysis\, spectral conversion\, prosody conversion\, speaker characterization\, and vocoding. With the recent advances in theory and practice\, we are now able to produce human-like voice quality with high speaker similarity. In this talk\, Dr. Sisman will present the recent advances in voice conversion and discuss their promise and limitations. Dr. Sisman will also provide a summary of the available resources for expressive voice conversion research.
\nBiography
\nDr. Berrak Sisman (Member\, IEEE) received the Ph.D. degree in electrical and computer engineering from the National University of Singapore in 2020\, fully funded by the A*STAR Graduate Academy under the Singapore International Graduate Award (SINGA). She is currently working as a tenure-track Assistant Professor in the Erik Jonsson School Department of Electrical and Computer Engineering at the University of Texas at Dallas\, United States. Prior to joining UT Dallas\, she was a faculty member at the Singapore University of Technology and Design (2020-2022). She was a Postdoctoral Research Fellow at the National University of Singapore (2019-2020). She was an exchange doctoral student at the University of Edinburgh and a visiting scholar at The Centre for Speech Technology Research (CSTR)\, University of Edinburgh (2019). She was a visiting researcher at the RIKEN Advanced Intelligence Project in Japan (2018). Her research is focused on machine learning\, signal processing\, emotion\, speech synthesis\, and voice conversion.
\nDr. Sisman has served as an Area Chair at INTERSPEECH 2021\, INTERSPEECH 2022\, and IEEE SLT 2022\, and as the Publication Chair at ICASSP 2022. She has been elected as a member of the IEEE Speech and Language Processing Technical Committee (SLTC) in the area of Speech Synthesis for the term from January 2022 to December 2024. She plays leadership roles in conference organizations and is active in technical committees. She has served as the General Coordinator of the Student Advisory Committee (SAC) of the International Speech Communication Association (ISCA).
DTSTART;TZID=America/New_York:20221104T120000 DTEND;TZID=America/New_York:20221104T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Berrak Sisman (University of Texas at Dallas) “Speech Synthesis and Voice Conversion: Machine Learning can Mimic Anyone’s Voice” URL:https://www.clsp.jhu.edu/events/berrak-sisman-university-of-texas-at-dallas/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,November\,Sisman END:VEVENT BEGIN:VEVENT UID:ai1ec-22408@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nDriven by the goal of eradicating language barriers on a global scale\, machine translation has solidified itself as a key focus of artificial intelligence research today. However\, such efforts have coalesced around a small subset of languages\, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200-language barrier while ensuring safe\, high-quality results\, all while keeping ethical considerations in mind? In this talk\, I introduce No Language Left Behind\, an initiative to break language barriers for low-resource languages. In No Language Left Behind\, we took on the low-resource language translation challenge by first contextualizing the need for translation support through exploratory interviews with native speakers. Then\, we created datasets and models aimed at narrowing the performance gap between low- and high-resource languages. We proposed multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically\, we evaluated the performance of over 40\,000 different translation directions using a human-translated benchmark\, Flores-200\, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art\, laying important groundwork towards realizing a universal translation system in an open-source manner.
\nBiography
\nAngela is a research scientist at Meta AI Research in New York\, focusing on supporting efforts in speech and language research. Recent projects include No Language Left Behind (https://ai.facebook.com/research/no-language-left-behind/) and Universal Speech Translation for Unwritten Languages (https://ai.facebook.com/blog/ai-translation-hokkien/). Before working on translation\, Angela focused on research in on-device models for NLP and computer vision\, and on text generation.
\nDTSTART;TZID=America/New_York:20221118T120000 DTEND;TZID=America/New_York:20221118T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Angela Fan (Meta AI Research) “No Language Left Behind: Scaling Human-Centered Machine Translation” URL:https://www.clsp.jhu.edu/events/angela-fan-facebook/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,Fan\,November END:VEVENT BEGIN:VEVENT UID:ai1ec-22417@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nOne of the keys to success in machine learning applications is to improve each user’s personal experience via personalized models. A personalized model can also be a more resource-efficient solution than a general-purpose model\, because it focuses on a particular sub-problem\, for which a smaller model architecture can be good enough. However\, training a personalized model requires data from the particular test-time user\, which are not always available due to their private nature and technical challenges. Furthermore\, such data tend to be unlabeled as they can be collected only at test time\, once the system is deployed to user devices. One could rely on the generalization power of a generic model\, but such a model can be too computationally/spatially complex for real-time processing in a resource-constrained device. In this talk\, I will present some techniques to circumvent the lack of labeled personal data in the context of speech enhancement. Our machine learning models will require zero or few data samples from the test-time users\, while still achieving the personalization goal. To this end\, we will investigate modularized speech enhancement models as well as the potential of self-supervised learning for personalized speech enhancement. Because our research achieves the personalization goal in a data- and resource-efficient way\, it is a step towards more available and affordable AI for society.
\nBiography
\nMinje Kim is an associate professor in the Dept. of Intelligent Systems Engineering at Indiana University\, where he leads his research group\, Signals and AI Group in Engineering (SAIGE). He is also an Amazon Visiting Academic\, consulting for Amazon Lab126. At IU\, he is affiliated with various programs and labs such as Data Science\, Cognitive Science\, the Dept. of Statistics\, and the Center for Machine Learning. He earned his Ph.D. in the Dept. of Computer Science at the University of Illinois at Urbana-Champaign. Before joining UIUC\, he worked as a researcher at ETRI\, a national lab in Korea\, from 2006 to 2011. Before then\, he received his Master’s and Bachelor’s degrees in the Dept. of Computer Science and Engineering at POSTECH (Summa Cum Laude) and in the Division of Information and Computer Engineering at Ajou University (with honors) in 2006 and 2004\, respectively. He is a recipient of various awards including the NSF CAREER Award (2021)\, IU Trustees Teaching Award (2021)\, IEEE SPS Best Paper Award (2020)\, and Google and Starkey grants for outstanding student papers at ICASSP 2013 and 2014\, respectively. He is an IEEE Senior Member and a member of the IEEE Audio and Acoustic Signal Processing Technical Committee (2018-2023). He is serving as an Associate Editor for the EURASIP Journal of Audio\, Speech\, and Music Processing\, and as a Consulting Associate Editor for the IEEE Open Journal of Signal Processing. He is also a reviewer\, program committee member\, or area chair for the major machine learning and signal processing venues. He has filed more than 50 patent applications as an inventor.
DTSTART;TZID=America/New_York:20221202T120000 DTEND;TZID=America/New_York:20221202T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Minje Kim (Indiana University) “Personalized Speech Enhancement: Data- and Resource-Efficient Machine Learning” URL:https://www.clsp.jhu.edu/events/minje-kim-indiana-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,December\,Kim END:VEVENT BEGIN:VEVENT UID:ai1ec-22422@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nZipf’s law is commonly glossed by the aphorism “infrequent words are frequent\,” but in practice\, it has often meant that there are three types of words: frequent\, infrequent\, and out-of-vocabulary (OOV). Speech recognition solved the problem of frequent words in 1970 (with dynamic time warping). Hidden Markov models worked well for moderately infrequent words\, but the problem of OOV words was not solved until sequence-to-sequence neural nets de-reified the concept of a word. Many other social phenomena follow power-law distributions. The number of native speakers of the N’th most spoken language\, for example\, is 1.44 billion over N to the 1.09. In languages with sufficient data\, we have shown that monolingual pre-training outperforms multilingual pre-training. In less-frequent languages\, multilingual knowledge transfer can significantly reduce phone error rates. In languages with no training data\, unsupervised ASR methods can be proven to converge\, as long as the eigenvalues of the language model are sufficiently well separated to be measurable. Other systems of social categorization may follow similar power-law distributions. Disability\, for example\, can cause speech patterns that were never seen in the training database\, but not all disabilities need do so. The inability of speech technology to work for people with even common disabilities is probably caused by a lack of data\, and can probably be solved by finding better modes of interaction between technology researchers and the communities served by technology.
\nBiography
\nMark Hasegawa-Johnson is a William L. Everitt Faculty Fellow of Electrical and Computer Engineering at the University of Illinois in Urbana-Champaign. He has published research in speech production and perception\, source separation\, voice conversion\, and low-resource automatic speech recognition.
DTSTART;TZID=America/New_York:20221209T120000 DTEND;TZID=America/New_York:20221209T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Mark Hasegawa-Johnson (University of Illinois Urbana-Champaign) “Zipf’s Law Suggests a Three-Pronged Approach to Inclusive Speech Recognition” URL:https://www.clsp.jhu.edu/events/mark-hasegawa-johnson-university-of-illinois-urbana-champaign/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,December\,Hasegawa-Johnson END:VEVENT BEGIN:VEVENT UID:ai1ec-23304@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nTransformers are essential to pretraining. As we approach 5 years of BERT\, the connection between attention as architecture and transfer learning remains key to this central thread in NLP. Other architectures such as CNNs and RNNs have been used to replicate pretraining results\, but these either fail to reach the same accuracy or require supplemental attention layers. This work revisits the seminal BERT result and considers pretraining without attention. We consider replacing self-attention layers with recently developed approaches for long-range sequence modeling and with transformer architecture variants. Specifically\, inspired by recent papers like the structured state space sequence model (S4)\, we use simple routing layers based on state-space models (SSMs) and a bidirectional model architecture based on multiplicative gating. We discuss the results of the proposed Bidirectional Gated SSM (BiGS) and present a range of analyses of its properties. Results show that architecture does seem to have a notable impact on downstream performance and a different inductive bias that is worth exploring further.
\nBiography
\nAbstract
\nWhile large language models have advanced the state-of-the-art in natural language processing\, these models are trained on large-scale datasets\, which may include harmful information. Studies have shown that as a result\, the models exhibit social biases and generate misinformation after training. In this talk\, I will discuss my work on analyzing and interpreting the risks of large language models across the areas of fairness\, trustworthiness\, and safety. I will first describe my research on the detection of dialect bias between African American English (AAE) and Standard American English (SAE). The second part investigates the trustworthiness of models through the memorization and subsequent generation of conspiracy theories. I will end my talk with recent work in AI safety regarding text that may lead to physical harm.
\nBiography
\nSharon is a 5th-year Ph.D. candidate at the University of California\, Santa Barbara\, where she is advised by Professor William Wang. Her research interests lie in natural language processing\, with a focus on Responsible AI. Sharon’s research spans the subareas of fairness\, trustworthiness\, and safety\, with publications in ACL\, EMNLP\, WWW\, and LREC. She has spent summers interning at AWS\, Meta\, and Pinterest. Sharon is a 2022 EECS Rising Star and a current recipient of the Amazon Alexa AI Fellowship for Responsible AI.
DTSTART;TZID=America/New_York:20230206T120000 DTEND;TZID=America/New_York:20230206T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Sharon Levy (University of California\, Santa Barbara) “Responsible AI via Responsible Large Language Models” URL:https://www.clsp.jhu.edu/events/sharon-levy-university-of-california-santa-barbara-responsible-ai-via-responsible-large-language-models/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,February\,Levy END:VEVENT BEGIN:VEVENT UID:ai1ec-23308@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nBiases in datasets\, or unintentionally introduced spurious cues\, are a common source of misspecification in machine learning. Performant models trained on such data can exhibit gender stereotypes or be brittle under distribution shift. In this talk\, we present several results in multimodal and question answering applications studying sources of dataset bias\, and several mitigation methods. We propose approaches where known dimensions of dataset bias are explicitly factored out of a model during learning\, without needing to modify data. Finally\, we ask whether dataset biases can be attributed to annotator behavior during annotation. Drawing inspiration from work in psychology on cognitive biases\, we show certain behavioral patterns are highly indicative of the creation of problematic (but valid) data instances in question answering. We give evidence that many existing observations around how dataset bias propagates to models can be attributed to data samples created by the annotators we identify.
\nBiography
\nMark Yatskar is an Assistant Professor at the University of Pennsylvania in the Department of Computer and Information Science. He did his PhD at the University of Washington\, co-advised by Luke Zettlemoyer and Ali Farhadi. He was a Young Investigator at the Allen Institute for Artificial Intelligence for several years\, working with their computer vision team\, PRIOR. His work spans Natural Language Processing\, Computer Vision\, and Fairness in Machine Learning. He received a Best Paper Award at EMNLP for work on gender bias amplification\, and his work has been featured in Wired and the New York Times.
\nDTSTART;TZID=America/New_York:20230210T120000 DTEND;TZID=America/New_York:20230210T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Mark Yatskar (University of Pennsylvania) “Understanding Dataset Biases: Behavioral Indicators During Annotation and Contrastive Mitigations” URL:https://www.clsp.jhu.edu/events/mark-yatskar-university-of-pennsylvania/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,February\,Yatskar END:VEVENT BEGIN:VEVENT UID:ai1ec-23314@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nWhile GPT models have shown impressive performance on summarization and open-ended text generation\, it’s important to assess their abilities on more constrained text generation tasks that require significant and diverse rewritings. In this talk\, I will discuss the challenges of evaluating systems that are highly competitive and perform close to humans on two such tasks: (i) paraphrase generation and (ii) text simplification. To address these challenges\, we introduce an interactive Rank-and-Rate evaluation framework. Our results show that GPT-3.5 has made a major step up from fine-tuned T5 in paraphrase generation\, but still lacks the diversity and creativity of humans\, who spontaneously produce large quantities of paraphrases.
\nAdditionally\, we demonstrate that GPT-3.5 performs similarly to a single human in text simplification\, which makes it difficult for existing automatic evaluation metrics to distinguish between the two. To overcome this shortcoming\, we propose LENS\, a learnable evaluation metric that outperforms SARI\, BERTScore\, and other existing methods in both automatic evaluation and minimal risk decoding for text generation.
\nBiography
\nWei Xu is an assistant professor in the School of Interactive Computing at the Georgia Institute of Technology\, where she is also affiliated with the new NSF AI CARING Institute and the Machine Learning Center. She received her Ph.D. in Computer Science from New York University and her B.S. and M.S. from Tsinghua University. Xu’s research interests are in natural language processing\, machine learning\, and social media\, with a focus on text generation\, stylistics\, robustness and controllability of machine learning models\, and reading and writing assistive technology. She is a recipient of the NSF CAREER Award\, CrowdFlower AI for Everyone Award\, Criteo Faculty Research Award\, and a Best Paper Award at COLING’18. She has also received funds from DARPA and IARPA. She is an elected member of the NAACL executive board and regularly serves as a senior area chair for AI/NLP conferences.
DTSTART;TZID=America/New_York:20230224T120000 DTEND;TZID=America/New_York:20230224T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Wei Xu (Georgia Tech) “GPT-3 vs Humans: Rethinking Evaluation of Natural Language Generation” URL:https://www.clsp.jhu.edu/events/wei-xu-georgia-tech/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,February\,Xu END:VEVENT BEGIN:VEVENT UID:ai1ec-23316@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nUnderstanding the implications underlying a text is critical to assessing its impact\, in particular the social dynamics that may result from a reading of the text. This requires endowing artificial intelligence (AI) systems with pragmatic reasoning\, for example to correctly conclude that the statement “Epidemics and cases of disease in the 21st century are ‘staged’” relates to unfounded conspiracy theories. In this talk\, I discuss how shortcomings in the ability of current AI systems to reason about pragmatics present challenges to equitable detection of false or harmful language. I demonstrate how these shortcomings can be addressed by imposing human-interpretable structure on deep learning architectures using insights from linguistics.
\nIn the first part of the talk\, I describe how adversarial text generation algorithms can be used to improve robustness of content moderation systems. I then introduce a pragmatic formalism for reasoning about harmful implications conveyed by social media text. I show how this pragmatic approach can be combined with generative neural language models to uncover implications of news headlines. I also address the bottleneck to progress in text generation posed by gaps in the evaluation of factuality. I conclude by showing how context-aware content moderation can be used to ensure safe interactions with conversational agents.
\nBiography
\nSaadia Gabriel is a PhD candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington\, advised by Prof. Yejin Choi and Prof. Franziska Roesner. Her research revolves around natural language processing and machine learning\, with a particular focus on building systems for understanding how social commonsense manifests in text (i.e. how do people typically behave in social scenarios)\, as well as mitigating the spread of false or harmful text (e.g. Covid-19 misinformation). Her work has been covered by a wide range of media outlets like Forbes and TechCrunch. It has also received a 2019 ACL best short paper nomination\, a 2019 IROS RoboCup best paper nomination\, and won a best paper award at the 2020 WeCNLP summit. Prior to her PhD\, Saadia received a BA summa cum laude from Mount Holyoke College in Computer Science and Mathematics.
\nDTSTART;TZID=America/New_York:20230227T120000 DTEND;TZID=America/New_York:20230227T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Saadia Gabriel (University of Washington) “Socially Responsible and Factual Reasoning for Equitable AI Systems” URL:https://www.clsp.jhu.edu/events/saadia-gabriel-university-of-washington-socially-responsible-and-factual-reasoning-for-equitable-ai-systems/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,February\,Gabriel END:VEVENT BEGIN:VEVENT UID:ai1ec-23312@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nAdvanced neural language models have grown ever larger and more complex\, pushing forward the limits of language understanding and generation\, while diminishing interpretability. The black-box nature of deep neural networks blocks humans from understanding them\, as well as trusting and using them in real-world applications. This talk will introduce interpretation techniques that bridge the gap between humans and models for developing trustworthy natural language processing (NLP). I will first show how to explain black-box models and evaluate their explanations for understanding their prediction behavior. Then I will introduce how to improve the interpretability of neural language models by making their decision-making transparent and rationalized. Finally\, I will discuss how to diagnose and improve models (e.g.\, robustness) through the lens of explanations. I will conclude with future research directions that are centered around model interpretability and committed to facilitating communications and interactions between intelligent machines\, system developers\, and end users for long-term trustworthy AI.
\nBiography
\nHanjie Chen is a Ph.D. candidate in Computer Science at the University of Virginia\, advised by Prof. Yangfeng Ji. Her research interests lie in Trustworthy AI\, Natural Language Processing (NLP)\, and Interpretable Machine Learning. She develops interpretation techniques to explain neural language models and make their prediction behavior transparent and reliable. She is a recipient of the Carlos and Esther Farrar Fellowship and the Best Poster Award at ACM CAPWIC 2021. Her work has been published at top-tier NLP/AI conferences (e.g.\, ACL\, AAAI\, EMNLP\, NAACL) and selected as a National Center for Women & Information Technology (NCWIT) Collegiate Award Finalist in 2021. She (as the primary instructor) co-designed and taught the course Interpretable Machine Learning\, and was awarded the UVA CS Outstanding Graduate Teaching Award and nominated for the University-wide Graduate Teaching Awards (top 5% of graduate instructors). More details can be found at https://www.cs.virginia.edu/~hc9mx
DTSTART;TZID=America/New_York:20230313T120000 DTEND;TZID=America/New_York:20230313T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Hanjie Chen (University of Virginia) “Bridging Humans and Machines: Techniques for Trustworthy NLP” URL:https://www.clsp.jhu.edu/events/hanjie-chen-university-of-virginia/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Chen\,February END:VEVENT BEGIN:VEVENT UID:ai1ec-24241@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nOur research focuses on improving speech processing algorithms\, such as automatic speech recognition (ASR)\, speaker identification\, and depression detection\, under challenging conditions such as limited data (for example\, children’s or clinical speech)\, mismatched conditions (for example\, training on read speech while recognizing conversational speech)\, and noisy speech\, using a hybrid data-driven and knowledge-based approach. This approach requires understanding both of machine learning approaches and of the human speech production and perception systems. I will summarize in this talk our work on children’s ASR using self-supervised models\, detecting depression from speech signals using novel speaker disentanglement techniques\, and automating the scoring of children’s reading tasks with both ASR and innovative NLP algorithms.
\nBiography
\nAbeer Alwan received her Ph.D. in Electrical Engineering and Computer Science from MIT in 1992. Since then\, she has been with the ECE department at UCLA\, where she is now a Full Professor and directs the Speech Processing and Auditory Perception Laboratory. She is the recipient of the NSF Research Initiation and CAREER Awards\, NIH FIRST Award\, UCLA-TRW Excellence in Teaching Award\, Okawa Foundation Award in Telecommunication\, and the Engineer’s Council Educator Award. She is a Fellow of the Acoustical Society of America\, IEEE\, and the International Speech Communication Association (ISCA). She was a Fellow at the Radcliffe Institute\, Harvard University\, co-Editor in Chief of Speech Communication\, Associate Editor of JASA and IEEE TSALP\, a Distinguished Lecturer of ISCA\, and a member of the IEEE Signal Processing Board of Governors\, and she is currently on the advisory boards of ISCA and the UCLA-Amazon Science Hub for Humanity and AI.
DTSTART;TZID=America/New_York:20240202T120000 DTEND;TZID=America/New_York:20240202T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Abeer Alwan (UCLA) “Dealing with Limited Speech Data and Variability: Three case studies” URL:https://www.clsp.jhu.edu/events/abeer-alwan-ucla/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,Alwan\,February END:VEVENT BEGIN:VEVENT UID:ai1ec-24425@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\n
Over the past three decades\, the fields of automatic speech recognition (ASR) and machine translation (MT) have witnessed remarkable advancements\, leading to exciting research directions such as speech-to-text translation (ST). This talk will delve into the domain of conversational ST\, an essential facet of daily communication\, which presents unique challenges including spontaneous informal language\, the presence of disfluencies\, high context dependence\, and a scarcity of paired ST data.
\nAbstract
\nThere is an enormous data gap between how AI systems and children learn language: The best LLMs now learn language from text with a word count in the trillions\, whereas it would take a child roughly 100K years to reach those numbers through speech (Frank\, 2023\, “Bridging the data gap”). There is also a clear generalization gap: whereas machines struggle with systematic generalization\, people excel. For instance\, once a child learns how to “skip\,” they immediately know how to “skip twice” or “skip around the room with their hands up” due to their compositional skills. In this talk\, I’ll describe two case studies in addressing these gaps:
\n1) The data gap: We train deep neural networks from scratch (using DINO\, CLIP\, etc.)\, not on large-scale data from the web\, but through the eyes and ears of a single child. Using head-mounted video recordings from a child (61 hours of video slices over 19 months)\, we show how deep neural networks can acquire many word-referent mappings\, generalize to novel visual referents\, and achieve multi-modal alignment. Our results demonstrate how today’s AI models are capable of learning key aspects of children’s early knowledge from realistic input.
\n2) The generalization gap: Can neural networks capture human-like systematic generalization? We address a 35-year-old debate catalyzed by Fodor and Pylyshyn’s classic article\, which argued that standard neural networks are not viable models of the mind because they lack systematic compositionality — the algebraic ability to understand and produce novel combinations from known components. We’ll show how neural networks can achieve human-like systematic generalization when trained through meta-learning for compositionality (MLC)\, a new method for optimizing the compositional skills of neural networks through practice. With MLC\, a neural network can match human performance and solve several machine learning benchmarks.
\nGiven this work\, we’ll discuss the paths forward for building machines that learn\, generalize\, and interact in more human-like ways based on more natural input.
\nRelated articles:
\nVong\, W. K.\, Wang\, W.\, Orhan\, A. E.\, and Lake\, B. M. (2024). Grounded language acquisition through the eyes and ears of a single child. Science\, 383.
\nOrhan\, A. E.\, and Lake\, B. M. (in press). Learning high-level visual representations from a child’s perspective without strong inductive biases. Nature Machine Intelligence.
\nLake\, B. M.\, and Baroni\, M. (2023). Human-like systematic generalization through a meta-learning neural network. Nature\, 623\, 115-121.
\nBiography
\nBrenden M. Lake is an Assistant Professor of Psychology and Data Science at New York University. He received his M.S. and B.S. in Symbolic Systems from Stanford University in 2009\, and his Ph.D. in Cognitive Science from MIT in 2014. He was a postdoctoral Data Science Fellow at NYU from 2014-2017. Brenden is a recipient of the Robert J. Glushko Prize for Outstanding Doctoral Dissertation in Cognitive Science and an MIT Technology Review Innovator Under 35\, and his research was selected by Scientific American as one of the 10 most important advances of 2016. Brenden’s research focuses on computational problems that are easier for people than they are for machines\, such as learning new concepts\, creating new concepts\, learning-to-learn\, and asking questions.
\nAbstract
\nLarge language models like ChatGPT have shown extraordinary abilities for writing. While impressive at first glance\, large language models aren’t perfect and often make mistakes humans would not make. The main architecture behind ChatGPT mostly doesn’t differ from early neural networks\, and as a consequence\, it carries some of the same limitations. My work revolves around the use of neural networks like ChatGPT mixed with symbolic methods from early AI\, and how these two families of methods can combine to create more robust AI. I talk about some of the neurosymbolic methods I used for applications in story generation and understanding — with the goal of eventually creating AI that can play Dungeons & Dragons. I also discuss pain points that I found for improving accessible communication and show how large language models can supplement such communication.
\nBiography
\nAbstract
\nWe introduce STAR (Stream Transduction with Anchor Representations)\, a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams. STAR dynamically segments input streams to create compressed anchor representations\, achieving nearly lossless compression (12x) in Automatic Speech Recognition (ASR) and outperforming existing methods. Moreover\, STAR demonstrates superior segmentation and latency-quality trade-offs in simultaneous speech-to-text tasks\, optimizing latency\, memory footprint\, and quality.
DTSTART;TZID=America/New_York:20240219T120000 DTEND;TZID=America/New_York:20240219T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Steven Tan “Streaming Sequence Transduction through Dynamic Compression” URL:https://www.clsp.jhu.edu/events/steven-tan-streaming-sequence-transduction-through-dynamic-compression/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,February\,Tan END:VEVENT BEGIN:VEVENT UID:ai1ec-24429@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nI discuss the application of Foundation Models in Astronomy through the collaborative efforts of the UniverseTBD consortium\, whose mission is to democratize Science for everyone. One of our key objectives is to overcome the limitations of general-purpose Foundation Models\, such as producing limited information in specialized fields. To this end\, we have developed the first specialized large language model for Astronomy\, AstroLLaMa-1. This model\, enhanced by exposure to domain-specific literature from the NASA Astrophysics Data System and arXiv\, demonstrates improved text completion and embedding capabilities over existing GPT models. I further discuss the potential of LLMs in generating complex scientific hypotheses and extracting meaningful insights from astronomy literature. Our findings\, validated by human experts\, demonstrate the capability of LLMs for informed scientific critique and uncover intriguing patterns in the embedding space\, highlighting the potential of LLMs to augment scientific inquiry. I will also discuss preliminary work with the multi-modal model AstroLLaVA\, which allows us to interact with astronomical images via natural language. Through the work of UniverseTBD\, we aim to explore how artificial intelligence can assist human intelligence in Astronomy and\, more broadly\, Science.
\nBiography
\nIoana Ciucă\, who goes by Jo\, is an interdisciplinary Jubilee Joint Fellow at the Australian National University\, working across the School of Computing and the Research School of Astronomy & Astrophysics. Before joining ANU\, Jo finished her PhD in Astrophysics at University College London in the United Kingdom\, where she worked at the intersection of Astronomy and Machine Learning to understand the formation and evolution history of our Galaxy\, the Milky Way. Jo is now focusing on utilizing foundation models that benefit researchers everywhere\, working alongside the UniverseTBD team of more than 30 astronomers\, engineers\, ML practitioners\, and enthusiasts worldwide.
\nDTSTART;TZID=America/New_York:20240223T120000 DTEND;TZID=America/New_York:20240223T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Ioana Ciuca (Australian National University) “A Universe To Be Decided: Towards Specialized Foundation Models for Advancing Astronomy” URL:https://www.clsp.jhu.edu/events/ioana-ciuca-australian-national-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,Ciuca\,February END:VEVENT BEGIN:VEVENT UID:ai1ec-24457@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:
Abstract
\nAs artificial intelligence (AI) continues to rapidly expand into existing healthcare infrastructure – e.g.\, clinical decision support\, administrative tasks\, and public health surveillance – it is perhaps more important than ever to reflect on the broader purpose of such systems. While much focus has been on the potential for this technology to improve general health outcomes\, there also exists a significant\, but understated\, opportunity to use this technology to address health-related disparities. Accomplishing the latter depends not only on our ability to effectively identify addressable areas of systemic inequality and translate them into machine-learnable tasks\, but also on our ability to measure\, interpret\, and counteract barriers in training data that may inhibit robustness to distribution shift upon deployment (i.e.\, new populations\, temporal dynamics). In this talk\, we will discuss progress made along both of these dimensions. We will begin by providing background on the state of AI for promoting health equity. Then\, we will present results from a recent clinical phenotyping project and discuss their implications for prevailing views regarding language model robustness in clinical applications. Finally\, we will showcase ongoing efforts to proactively address systemic inequality in healthcare by identifying and characterizing stigmatizing language in medical records.
DTSTART;TZID=America/New_York:20240226T120000 DTEND;TZID=America/New_York:20240226T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Keith Harrigian (JHU) “Fighting Bias From Bias: Robust Natural Language Processing Techniques to Promote Health Equity” URL:https://www.clsp.jhu.edu/events/keith-harrigian-jhu-fighting-bias-from-bias-robust-natural-language-processing-techniques-to-promote-health-equity/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,February\,Harrigian END:VEVENT BEGIN:VEVENT UID:ai1ec-24511@www.clsp.jhu.edu DTSTAMP:20240328T213009Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20240412T120000 DTEND;TZID=America/New_York:20240412T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Sonal Joshi (JHU) URL:https://www.clsp.jhu.edu/events/sonal-joshi-jhu/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,April\,Joshi END:VEVENT END:VCALENDAR