BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-21259@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nNatural language processing has been revolutionized b y neural networks\, which perform impressively well in applications such a s machine translation and question answering. Despite their success\, neur al networks still have some substantial shortcomings: Their internal worki ngs are poorly understood\, and they are notoriously brittle\, failing on example types that are rare in their training data. In this talk\, I will use the unifying thread of hierarchical syntactic structure to discuss app roaches for addressing these shortcomings. First\, I will argue for a new evaluation paradigm based on targeted\, hypothesis-driven tests that bette r illuminate what models have learned\; using this paradigm\, I will show that even state-of-the-art models sometimes fail to recognize the hierarch ical structure of language (e.g.\, to conclude that “The book on the table is blue” implies “The table is blue.”) Second\, I will show how these beh avioral failings can be explained through analysis of models’ inductive bi ases and internal representations\, focusing on the puzzle of how neural n etworks represent discrete symbolic structure in continuous vector space. 
I will close by showing how insights from these analyses can be used to ma ke models more robust through approaches based on meta-learning\, structur ed architectures\, and data augmentation.\nBiography\nTom McCoy is a PhD c andidate in the Department of Cognitive Science at Johns Hopkins Universit y. As an undergraduate\, he studied computational linguistics at Yale. His research combines natural language processing\, cognitive science\, and m achine learning to study how we can achieve robust generalization in model s of language\, as this remains one of the main areas where current AI sys tems fall short. In particular\, he focuses on inductive biases and repres entations of linguistic structure\, since these are two of the major compo nents that determine how learners generalize to novel types of input. DTSTART;TZID=America/New_York:20220131T120000 DTEND;TZID=America/New_York:20220131T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Tom McCoy (Johns Hopkins University) “Opening the Black Box of Deep Learning: Representations\, Inductive Biases\, and Robustness” URL:https://www.clsp.jhu.edu/events/tom-mccoy-johns-hopkins-university-open ing-the-black-box-of-deep-learning-representations-inductive-biases-and-ro bustness/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n
\\nAbstract
\nNatural language processing has been revolutionized by neural networks\, which perform impressively well in applications such as machine translation and question answering. Despite their success\, neural networks still have some substantial shortcomings: Their internal workings are poorly understood\, and they are notoriously brittle\, failing on example types that are rare in their training data. In this talk\, I will use the unifying thread of hierarchical syntactic structure to discuss approaches for addressing these shortcomings. First\, I will argue for a new evaluation paradigm based on targeted\, hypothesis-driven tests that better illuminate what models have learned\; using this paradigm\, I will show that even state-of-the-art models sometimes fail to recognize the hierarchical structure of language (e.g.\, to conclude that “The book on the table is blue” implies “The table is blue.”) Second\, I will show how these behavioral failings can be explained through analysis of models’ inductive biases and internal representations\, focusing on the puzzle of how neural networks represent discrete symbolic structure in continuous vector space. I will close by showing how insights from these analyses can be used to make models more robust through approaches based on meta-learning\, structured architectures\, and data augmentation.
\nBiography
\nTom McCoy is a PhD candidate in the Department of Cognitive Science at Johns Hopkins University. As an undergraduate\, he studied computational linguistics at Yale. His research combines natural language processing\, cognitive science\, and machine learning to study how we can achieve robust generalization in models of language\, as this remains one of the main areas where current AI systems fall short. In particular\, he focuses on inductive biases and representations of linguistic structure\, since these are two of the major components that determine how learners generalize to novel types of input.
\n X-TAGS;LANGUAGE=en-US:2022\,January\,McCoy END:VEVENT BEGIN:VEVENT UID:ai1ec-21267@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nIn this talk\, I present a multipronged strategy for zero-shot cross-lingual Information Extraction\, that is the construction of an IE model for some target language\, given existing annotations exclu sively in some other language. This work is part of the JHU team’s effort under the IARPA BETTER program. I explore data augmentation techniques inc luding data projection and self-training\, and how different pretrained en coders impact them. We find through extensive experiments and extension of techniques that a combination of approaches\, both new and old\, leads to better performance than any one cross-lingual strategy in particular.\nBi ography\nMahsa Yarmohammadi is an assistant research scientist in CLSP\, J HU\, who leads state-of-the-art research in cross-lingual language and spe ech applications and algorithms. A primary focus of Yarmohammadi’s researc h is using deep learning techniques to transfer existing resources into ot her languages and to learn representations of language from multilingual d ata. She also works in automatic speech recognition and speech translation . Yarmohammadi received her PhD in computer science and engineering from O regon Health & Science University (2016). She joined CLSP as a post-doctor al fellow in 2017. DTSTART;TZID=America/New_York:20220204T120000 DTEND;TZID=America/New_York:20220204T131500 LOCATION:Ames 234 Presented Virtually via Zoom https://wse.zoom.us/j/967351 83473 SEQUENCE:0 SUMMARY:Mahsa Yarmohammadi (Johns Hopkins University) “Data Augmentation fo r Zero-shot Cross-Lingual Information Extraction” URL:https://www.clsp.jhu.edu/events/mahsa-yarmohammadi-johns-hopkins-univer sity-data-augmentation-for-zero-shot-cross-lingual-information-extraction/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\nIn this talk\, I present a multipronged strategy for zero-shot cross-lingual Information Extraction\, that is the construction of an IE model for some target language\, given existing annotations exclusively in some other language. This work is part of the JHU team’s effort under the IARPA BETTER program. I explore data augmentation techniques including data projection and self-training\, and how different pretrained encoders impact them. We find through extensive experiments and extension of techniques that a combination of approaches\, both new and old\, leads to better performance than any one cross-lingual strategy in particular.
\nBiography
\nAbstract
\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have questioned the robustness of longitudinal analyses based on statistical methods due to issues of temporal bias and semantic shift. To what extent are changes in semantics over time affecting the reliability of longitudinal analyses? We examine this question through a case study: understanding shifts in mental health during the course of the COVID-19 pandemic. We demonstrate that a recently-introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and improve predictive generalization over time. Ultimately\, we find that these analyses are critical to producing accurate longitudinal studies of social media.
\n X-TAGS;LANGUAGE=en-US:2022\,February\,Harrigian END:VEVENT BEGIN:VEVENT UID:ai1ec-21275@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract\n\n\n\nAutomatic discovery of phone or word-like units is one of the core objectives in zero-resource speech processing. Recent attempts employ contrastive predictive coding (CPC)\, where the model lear ns representations by predicting the next frame given past context. Howeve r\, CPC only looks at the audio signal’s structure at the frame level. The speech structure exists beyond frame-level\, i.e.\, at phone level or eve n higher. We propose a segmental contrastive predictive coding (SCPC) fram ework to learn from the signal structure at both the frame and phone level s.\n\nSCPC is a hierarchical model with three stages trained in an end-to- end manner. In the first stage\, the model predicts future feature frames and extracts frame-level representation from the raw waveform. In the seco nd stage\, a differentiable boundary detector finds variable-length segmen ts. In the last stage\, the model predicts future segments to learn segmen t representations. Experiments show that our model outperforms existing ph one and word segmentation methods on TIMIT and Buckeye datasets. DTSTART;TZID=America/New_York:20220211T120000 DTEND;TZID=America/New_York:20220211T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Saurabhchand Bhati “Segmental Contrastive Predict ive Coding for Unsupervised Acoustic Segmentation” URL:https://www.clsp.jhu.edu/events/student-seminar-saurabhchand-bhati/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\n\n\n\n\nAutomatic discovery of phone or word-like units is one of the core objectives in zero-resource speech processing. Recent attempts employ contrastive predictive coding (CPC)\, where the model learns representations by predicting the next frame given past context. However\, CPC only looks at the audio signal’s structure at the frame level. The speech structure exists beyond frame-level\, i.e.\, at phone level or even higher. We propose a segmental contrastive predictive coding (SCPC) framework to learn from the signal structure at both the frame and phone levels.\n\n\nSCPC is a hierarchical model with three stages trained in an end-to-end manner. In the first stage\, the model predicts future feature frames and extracts frame-level representation from the raw waveform. In the second stage\, a differentiable boundary detector finds variable-length segments. In the last stage\, the model predicts future segments to learn segment representations. Experiments show that our model outperforms existing phone and word segmentation methods on TIMIT and Buckeye datasets.
Abstract
\nAs humans\, our understanding of language is grounded in a rich mental model about “how the world works” – that we learn through perception and interaction. We use this understanding to reason beyond what we literally observe or read\, imagining how situations might unfold in the world. Machines today struggle at this kind of reasoning\, which limits how they can communicate with humans.
In my talk\, I will discuss three lines of work to bridge this gap between machines and humans. I will first discuss how we might measure grounded understanding. I will introduce a suite of approaches for constructing benchmarks\, using machines in the loop to filter out spurious biases. Next\, I will introduce PIGLeT: a model that learns physical commonsense understanding by interacting with the world through simulation\, using this knowledge to ground language. From an English-language description of an event\, PIGLeT can anticipate how the world state might change – outperforming text-only models that are orders of magnitude larger. Finally\, I will introduce MERLOT\, which learns about situations in the world by watching millions of YouTube videos with transcribed speech. Through training objectives inspired by the developmental psychology idea of multimodal reentry\, MERLOT learns to fuse language\, vision\, and sound together into powerful representations.
Together\, these directions suggest a path forward for building machines that learn language rooted in the world.
Biography
\nRowan Zellers is a final year PhD candidate at the University of Washington in Computer Science & Engineering\, advised by Yejin Choi and Ali Farhadi. His research focuses on enabling machines to understand language\, vision\, sound\, and the world beyond these modalities. He has been recognized through an NSF Graduate Fellowship and a NeurIPS 2021 outstanding paper award. His work has appeared in several media outlets\, including Wired\, the Washington Post\, and the New York Times. In the past\, he graduated from Harvey Mudd College with a B.S. in Computer Science & Mathematics\, and has interned at the Allen Institute for AI.
\n< /HTML> X-TAGS;LANGUAGE=en-US:2022\,February\,Zellers END:VEVENT BEGIN:VEVENT UID:ai1ec-21280@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nAs AI-driven language interfaces (such as chat-bots) become more integrated into our lives\, they need to become more versatile and reliable in their communication with human users. How can we make pro gress toward building more “general” models that are capable of understand ing a broader spectrum of language commands\, given practical constraints such as the limited availability of labeled data?\nIn this talk\, I will d escribe my research toward addressing this question along two dimensions o f generality. First I will discuss progress in “breadth” — models that add ress a wider variety of tasks and abilities\, drawing inspiration from exi sting statistical learning techniques such as multi-task learning. In part icular\, I will showcase a system that works well on several QA benchmarks \, resulting in state-of-the-art results on 10 benchmarks. Furthermore\, I will show its extension to tasks beyond QA (such as text generation or cl assification) that can be “defined” via natural language. In the second p art\, I will focus on progress in “depth” — models that can handle complex inputs such as compositional questions. 
I will introduce Text Modular Net works\, a general framework that casts problem-solving as natural language communication among simpler “modules.” Applying this framework to composi tional questions by leveraging discrete optimization and existing non-comp ositional closed-box QA models results in a model with strong empirical pe rformance on multiple complex QA benchmarks while providing human-readable reasoning.\nI will conclude with future research directions toward broade r NLP systems by addressing the limitations of the presented ideas and oth er missing elements needed to move toward more general-purpose interactive language understanding systems.\nBiography\nDaniel Khashabi is a postdoct oral researcher at the Allen Institute for Artificial Intelligence (AI2)\, Seattle. Previously\, he completed his Ph.D. in Computer and Information Sciences at the University of Pennsylvania in 2019. His interests lie at t he intersection of artificial intelligence and natural language processing \, with a vision toward more general systems through unified algorithms an d theories. DTSTART;TZID=America/New_York:20220218T120000 DTEND;TZID=America/New_York:20220218T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j /96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Daniel Khashabi (Allen Institute for Artificial Intelligence) “The Quest Toward Generality in Natural Language Understanding” URL:https://www.clsp.jhu.edu/events/daniel-khashabi-allen-institute-for-art ificial-intelligence/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\nAs AI-driven language interfaces (such as chat-bots) become more integrated into our lives\, they need to become more versatile and reliable in their communication with human users. How can we make progress toward building more “general” models that are capable of understanding a broader spectrum of language commands\, given practical constraints such as the limited availability of labeled data?
\nIn this talk\, I will describe my research toward addressing this question along two dimensions of generality. First I will discuss progress in “breadth” – models that address a wider variety of tasks and abilities\, drawing inspiration from existing statistical learning techniques such as multi-task learning. In particular\, I will showcase a system that works well on several QA benchmarks\, resulting in state-of-the-art results on 10 benchmarks. Furthermore\, I will show its extension to tasks beyond QA (such as text generation or classification) that can be “defined” via natural language. In the second part\, I will focus on progress in “depth” – models that can handle complex inputs such as compositional questions. I will introduce Text Modular Networks\, a general framework that casts problem-solving as natural language communication among simpler “modules.” Applying this framework to compositional questions by leveraging discrete optimization and existing non-compositional closed-box QA models results in a model with strong empirical performance on multiple complex QA benchmarks while providing human-readable reasoning.
\nI will conclude with future research directions toward broader NLP systems by addressing the limitations of the presented ideas and other missing elements needed to move toward more general-purpose interactive language understanding systems.
\nBiography
\nDaniel Khashabi is a postdoctoral researcher at the Allen Institute for Artificial Intelligence (AI2)\, Seattle. Previously\, he completed his Ph.D. in Computer and Information Sciences at the University of Pennsylvania in 2019. His interests lie at the intersection of artificial intelligence and natural language processing\, with a vision toward more general systems through unified algorithms and theories.
\n X-TAGS;LANGUAGE=en-US:2022\,February\,Khashabi END:VEVENT BEGIN:VEVENT UID:ai1ec-21487@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nEnormous amounts of ever-changing knowledge are avai lable online in diverse textual styles and diverse formats. Recent advance s in deep learning algorithms and large-scale datasets are spurring progre ss in many Natural Language Processing (NLP) tasks\, including question an swering. Nevertheless\, these models cannot scale up when task-annotated t raining data are scarce. This talk presents my lab’s work toward building general-purpose models in NLP and how to systematically evaluate them. Fir st\, I present a general model for two known tasks of question answering i n English and multiple languages that are robust to small domain shifts. Then\, I show a meta-training approach that can solve a variety of NLP tas ks with only using a few examples and introduce a benchmark to evaluate cr oss-task generalization. Finally\, I discuss neuro-symbolic approaches to address more complex tasks by eliciting knowledge from structured data and language models.\n\nBiography\n\nHanna Hajishirzi is an Assistant Profess or in the Paul G. Allen School of Computer Science & Engineering at the Un iversity of Washington and a Senior Research Manager at the Allen Institut e for AI. Her research spans different areas in NLP and AI\, focusing on d eveloping general-purpose machine learning algorithms that can solve many NLP tasks. Applications for these algorithms include question answering\, representation learning\, green AI\, knowledge extraction\, and conversati onal dialogue. Honors include the NSF CAREER Award\, Sloan Fellowship\, Al len Distinguished Investigator Award\, Intel rising star award\, best pape r and honorable mention awards\, and several industry research faculty awa rds. 
Hanna received her PhD from University of Illinois and spent a year as a postdoc at Disney Research and CMU. DTSTART;TZID=America/New_York:20220225T120000 DTEND;TZID=America/New_York:20220225T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 SEQUENCE:0 SUMMARY:Hanna Hajishirzi (University of Washington & Allen Institute for AI) “Toward Robust\, Knowledge-Rich NLP” URL:https://www.clsp.jhu.edu/events/hanna-hajishirzi-university-of-washington-allen-institute-for-ai-toward-robust-knowledge-rich-nlp/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstract
\nAbstract
\nSince it is increasingly hard to opt out from interacting with AI technology\, people demand that AI be capable of maintaining contracts such that it supports agency and oversight of people who are required to use it or who are affected by it. To help those people create a mental model about how to interact with AI systems\, I extend the underlying models to self-explain: predict the label/answer and explain this prediction. In this talk\, I will present how to generate (1) free-text explanations given in plain English that immediately tell users the gist of the reasoning\, and (2) contrastive explanations that help users understand how they could change the text to get another label.
\nBiography
\nAna Marasović is a postdoctoral researcher at the Allen Institute for AI (AI2) and the Paul G. Allen School of Computer Science & Engineering at University of Washington. Her research interests broadly lie in the fields of natural language processing\, explainable AI\, and vision-and-language learning. Her projects are motivated by a unified goal: improve interaction and control of the NLP systems to help people make these systems do what they want with the confidence that they’re getting exactly what they need. Prior to joining AI2\, Ana obtained her PhD from Heidelberg University.
\nHow to pronounce my name: the first name is Ana like in Spanish\, i.e.\, with a long “a” like in “water”\; regarding the last name: “mara” as in actress Mara Wilson + “so” + “veetch”.
\n< /BODY> X-TAGS;LANGUAGE=en-US:2022\,February\,Marasovic END:VEVENT BEGIN:VEVENT UID:ai1ec-21494@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract\nAdversarial attacks deceive neural network systems by adding carefully crafted perturbations to benign signals. Being almost im perceptible to humans\, these attacks pose a severe security threat to the state-of-the-art speech and speaker recognition systems\, making it vital to propose countermeasures against them. In this talk\, we focus on 1) cl assification of a given adversarial attack into attack algorithm type\, th reat model type\, and signal-to-adversarial-noise ratios\, 2) developing a novel speech denoising solution to further improve the classification per formance. \nOur proposed approach uses an x-vector network as a signature extractor to get embeddings\, which we call signatures. These signatures c ontain information about the attack and can help classify different attack algorithms\, threat models\, and signal-to-adversarial-noise ratios. We d emonstrate the transferability of such signatures to other tasks. In parti cular\, a signature extractor trained to classify attacks against speaker identification can also be used to classify attacks against speaker verifi cation and speech recognition. We also show that signatures can be used to detect unknown attacks i.e. attacks not included during training. Lastly \, we propose to improve the signature extractor by making the job of the signature extractor easier by removing the clean signal from the adversari al example (which consists of clean signal+perturbation). We train our sig nature extractor using adversarial perturbation. At inference time\, we us e a time-domain denoiser to obtain adversarial perturbation from adversari al examples. 
Using our improved approach\, we show that common attacks in the literature (Fast Gradient Sign Method (FGSM)\, Projected Gradient Descent (PGD)\, Carlini-Wagner (CW)) can be classified with accuracy as high as 96%. We also detect unknown attacks with an equal error rate (EER) of about 9%\, which is very promising. DTSTART;TZID=America/New_York:20220304T120000 DTEND;TZID=America/New_York:20220304T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Sonal Joshi “Classify and Detect Adversarial Attacks Against Speaker and Speech Recognition Systems” URL:https://www.clsp.jhu.edu/events/student-seminar-sonal-joshi/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstract
\nAdversarial attacks deceive neural network systems by adding carefully crafted perturbations to benign signals. Being almost imperceptible to humans\, these attacks pose a severe security threat to the state-of-the-art speech and speaker recognition systems\, making it vital to propose countermeasures against them. In this talk\, we focus on 1) classification of a given adversarial attack into attack algorithm type\, threat model type\, and signal-to-adversarial-noise ratios\, 2) developing a novel speech denoising solution to further improve the classification performance.
\nOur proposed approach uses an x-vector network as a signature extractor to get embeddings\, which we call signatures. These signatures contain information about the attack and can help classify different attack algorithms\, threat models\, and signal-to-adversarial-noise ratios. We demonstrate the transferability of such signatures to other tasks. In particular\, a signature extractor trained to classify attacks against speaker identification can also be used to classify attacks against speaker verification and speech recognition. We also show that signatures can be used to detect unknown attacks i.e. attacks not included during training. Lastly\, we propose to improve the signature extractor by making the job of the signature extractor easier by removing the clean signal from the adversarial example (which consists of clean signal+perturbation). We train our signature extractor using adversarial perturbation. At inference time\, we use a time-domain denoiser to obtain adversarial perturbation from adversarial examples. Using our improved approach\, we show that common attacks in the literature (Fast Gradient Sign Method (FGSM)\, Projected Gradient Descent (PGD)\, Carlini-Wagner (CW)) can be classified with accuracy as high as 96%. We also detect unknown attacks with an equal error rate (EER) of about 9%\, which is very promising.
\n X-TAGS;LANGUAGE=en-US:2022\,Joshi\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-21615@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract\n\n\nWe consider a problem of data collection for sema ntically rich NLU tasks\, where detailed semantics of documents (or uttera nces) are captured using a complex meaning representation. Previously\, d ata collection for such tasks was either handled at the cost of extensive annotator training (e.g. in FrameNet or PropBank) or simplified meaning re presentation (e.g. in QA-SRL or Overnight). In this talk\, we present two systems [1\, 2] that aim to support fast\, accurate\, and expressive sema ntic annotations by pairing human workers with a trained model in the loop .\n\nThe first system\, called Guided K-best [1]\, is an annotation toolki t for conversational semantic parsing. Instead of typing annotations from scratch\, data specialists choose a correct parse from the K-best output of a few-shot prototyped model. As the K-best list can be large (e.g. K=1 00)\, we guide the annotators’ exploration of the K-best list via explaina ble hierarchical clustering. In addition\, we experiment with RoBERTa-bas ed reranking of the K-best list to recalibrate the few-shot model towards Accuracy@K. The final system allows to annotate data up to 35% faster tha n the standard\, non-guided K-best and improves the few-shot model’s top-1 accuracy by up to 18%. The second system\, called SchemaBlocks [2]\, is an annotation toolkit for schemas\, or structured descriptions of frequent real-world scenarios (e.g.\, cooking a meal). It represents schemas in t he annotation UI as nested blocks. Using a novel Causal ARM model\, we fu rther speed up the annotation process and guide data specialists towards e xpressive and diverse schemas. 
As part of this work\, we collect 232 schemas\, evaluating their internal coherence and their coverage on large-scale newswire corpora.\n\n\n DTSTART;TZID=America/New_York:20220311T120000 DTEND;TZID=America/New_York:20220311T131500 LOCATION:Virtual Seminar SEQUENCE:0 SUMMARY:Student Seminar – Anton Belyy “Systems for Human-AI Cooperation on Collecting Semantic Annotations” URL:https://www.clsp.jhu.edu/events/student-seminar-anton-belyy-systems-for-human-ai-cooperation-on-collecting-semantic-annotations/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstract
\n\n X-TAGS;LANGUAGE=en-US:2022\,Belyy\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-21621@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nSystems that support expressive\, situated natural la nguage interactions are essential for expanding access to complex computin g systems\, such as robots and databases\, to non-experts. Reasoning and l earning in such natural language interactions is a challenging open proble m. For example\, resolving sentence meaning requires reasoning not only ab out word meaning\, but also about the interaction context\, including the history of the interaction and the situated environment. In addition\, the sequential dynamics that arise between user and system in and across inte ractions make learning from static data\, i.e.\, supervised data\, both ch allenging and ineffective. However\, these same interaction dynamics resul t in ample opportunities for learning from implicit and explicit feedback that arises naturally in the interaction. This lays the foundation for sys tems that continually learn\, improve\, and adapt their language use throu gh interaction\, without additional annotation effort. In this talk\, I wi ll focus on these challenges and opportunities. First\, I will describe ou r work on modeling dependencies between language meaning and interaction c ontext when mapping natural language in interaction to executable code. In the second part of the talk\, I will describe our work on language unders tanding and generation in collaborative interactions\, focusing on continu al learning from explicit and implicit user feedback.\nBiography\nAlane Su hr is a PhD Candidate in the Department of Computer Science at Cornell Uni versity\, advised by Yoav Artzi. 
Her research spans natural language processing\, machine learning\, and computer vision\, with a focus on building systems that participate and continually learn in situated natural language interactions with human users. Alane’s work has been recognized by paper awards at ACL and NAACL\, and has been supported by fellowships and grants\, including an NSF Graduate Research Fellowship\, a Facebook PhD Fellowship\, and research awards from AI2\, ParlAI\, and AWS. Alane has also co-organized multiple workshops and tutorials appearing at NeurIPS\, EMNLP\, NAACL\, and ACL. Previously\, Alane received a BS in Computer Science and Engineering as an Eminence Fellow at the Ohio State University. DTSTART;TZID=America/New_York:20220314T120000 DTEND;TZID=America/New_York:20220314T131500 LOCATION:Virtual Seminar SEQUENCE:0 SUMMARY:Alane Suhr (Cornell University) “Reasoning and Learning in Interactive Natural Language Systems” URL:https://www.clsp.jhu.edu/events/alane-suhr-cornell-university-reasoning-and-learning-in-interactive-natural-language-systems/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\n
Abstract
\nSystems that support expressive\, situated natural language interactions are essential for expanding access to complex computing systems\, such as robots and databases\, to non-experts. Reasoning and learning in such natural language interactions is a challenging open problem. For example\, resolving sentence meaning requires reasoning not only about word meaning\, but also about the interaction context\, including the history of the interaction and the situated environment. In addition\, the sequential dynamics that arise between user and system in and across interactions make learning from static data\, i.e.\, supervised data\, both challenging and ineffective. However\, these same interaction dynamics result in ample opportunities for learning from implicit and explicit feedback that arises naturally in the interaction. This lays the foundation for systems that continually learn\, improve\, and adapt their language use through interaction\, without additional annotation effort. In this talk\, I will focus on these challenges and opportunities. First\, I will describe our work on modeling dependencies between language meaning and interaction context when mapping natural language in interaction to executable code. In the second part of the talk\, I will describe our work on language understanding and generation in collaborative interactions\, focusing on continual learning from explicit and implicit user feedback.
\nBiography
\nAlane Suhr is a PhD Candidate in the Department of Computer Science at Cornell University\, advised by Yoav Artzi. Her research spans natural language processing\, machine learning\, and computer vision\, with a focus on building systems that participate and continually learn in situated natural language interactions with human users. Alane’s work has been recognized by paper awards at ACL and NAACL\, and has been supported by fellowships and grants\, including an NSF Graduate Research Fellowship\, a Facebook PhD Fellowship\, and research awards from AI2\, ParlAI\, and AWS. Alane has also co-organized multiple workshops and tutorials appearing at NeurIPS\, EMNLP\, NAACL\, and ACL. Previously\, Alane received a BS in Computer Science and Engineering as an Eminence Fellow at the Ohio State University.
\n X-TAGS;LANGUAGE=en-US:2022\,March\,Suhr END:VEVENT BEGIN:VEVENT UID:ai1ec-21616@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of thes e tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have shown the absence of appropriate tu ning\, specifically in the presence of semantic shift\, can hinder robustn ess of the underlying methods. However\, little is known about the practic al effect this sensitivity may have on downstream longitudinal analyses. W e explore this gap in the literature through a timely case study: understa nding shifts in depression during the course of the COVID-19 pandemic. We find that inclusion of only a small number of semantically-unstable featur es can promote significant changes in longitudinal estimates of our target outcome. At the same time\, we demonstrate that a recently-introduced met hod for measuring semantic shift may be used to proactively identify failu re points of language-based models and\, in turn\, improve predictive gene ralization. DTSTART;TZID=America/New_York:20220318T120000 DTEND;TZID=America/New_York:20220318T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Keith Harrigian “The Problem of Semantic Shift in Longitudinal Monitoring of Social Media” URL:https://www.clsp.jhu.edu/events/student-seminar-keith-harrigian-the-pro blem-of-semantic-shift-in-longitudinal-monitoring-of-social-media/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have shown the absence of appropriate tuning\, specifically in the presence of semantic shift\, can hinder robustness of the underlying methods. However\, little is known about the practical effect this sensitivity may have on downstream longitudinal analyses. We explore this gap in the literature through a timely case study: understanding shifts in depression during the course of the COVID-19 pandemic. We find that inclusion of only a small number of semantically-unstable features can promote significant changes in longitudinal estimates of our target outcome. At the same time\, we demonstrate that a recently-introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and\, in turn\, improve predictive generalization.
\n X-TAGS;LANGUAGE=en-US:2022\,Harrigian\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-21497@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nWhile the “deep learning tsunami” continues to define the state of the art in speech and language processing\, finite-state tra nsducer grammars developed by linguists and engineers are still widely use d in industrial\, highly-multilingual settings\, particularly for symbolic \, “front-end” speech applications. In this talk\, I will first briefly re view the current state of the OpenFst and OpenGrm finite-state transducer libraries. I then review two “late-breaking” algorithms found in these lib raries. The first is a heuristic but highly-effective general-purpose opti mization routine for weighted transducers. The second is an algorithm for computing the single shortest string of non-deterministic weighted accepto rs which lack certain properties required by classic shortest-path algorit hms. I will then illustrate how the OpenGrm tools can be used to induce a finite-state string-to-string transduction model known as a pair n-gram mo del. This model has been applied to grapheme-to-phoneme conversion\, loanw ord detection\, abbreviation expansion\, and back-transliteration\, among other tasks.\nBiography\nKyle Gorman is an assistant professor of linguist ics at the Graduate Center\, City University of New York\, and director of the master’s program in computational linguistics\; he is also a software engineer in the speech and language algorithms group at Google. With Rich ard Sproat\, he is the coauthor of Finite-State Text Processing (Morgan & Claypool\, 2021) and the creator of Pynini\, a finite-state text processin g library for Python. He has also published on statistical methods for com paring computational models\, text normalization\, grapheme-to-phoneme con version\, and morphological analysis\, as well as many topics in linguisti c theory. 
DTSTART;TZID=America/New_York:20220401T120000 DTEND;TZID=America/New_York:20220401T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Kyle Gorman (City University of New York) ” Weighted Finite-State T ransducers: The Later Years” URL:https://www.clsp.jhu.edu/events/kyle-gorman-city-university-of-new-york -weighted-finite-state-transducers-the-later-years/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\nWhile the “deep learning tsunami” continues to define the state of the art in speech and language processing\, finite-state transducer grammars developed by linguists and engineers are still widely used in industrial\, highly-multilingual settings\, particularly for symbolic\, “front-end” speech applications. In this talk\, I will first briefly review the current state of the OpenFst and OpenGrm finite-state transducer libraries. I then review two “late-breaking” algorithms found in these libraries. The first is a heuristic but highly-effective general-purpose optimization routine for weighted transducers. The second is an algorithm for computing the single shortest string of non-deterministic weighted acceptors which lack certain properties required by classic shortest-path algorithms. I will then illustrate how the OpenGrm tools can be used to induce a finite-state string-to-string transduction model known as a pair n-gram model. This model has been applied to grapheme-to-phoneme conversion\, loanword detection\, abbreviation expansion\, and back-transliteration\, among other tasks.
\nBiography
\nKyle Gorman is an assistant professor of linguistics at the Graduate Center\, City University of New York\, and director of the master’s program in computational linguistics\; he is also a software engineer in the speech and language algorithms group at Google. With Richard Sproat\, he is the coauthor of Finite-State Text Processing (Morgan & Claypool\, 2021) and the creator of Pynini\, a finite-state text processing library for Python. He has also published on statistical methods for comparing computational models\, text normalization\, grapheme-to-phoneme conversion\, and morphological analysis\, as well as many topics in linguistic theory.
\n X-TAGS;LANGUAGE=en-US:2022\,Gorman\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-22374@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nIn recent years\, the field of Natural Language Proce ssing has seen a profusion of tasks\, datasets\, and systems that facilita te reasoning about real-world situations through language (e.g.\, RTE\, MN LI\, COMET). Such systems might\, for example\, be trained to consider a s ituation where “somebody dropped a glass on the floor\,” and conclude it i s likely that “the glass shattered” as a result. In this talk\, I will dis cuss three pieces of work that revisit assumptions made by or about these systems. In the first work\, I develop a Defeasible Inference task\, which enables a system to recognize when a prior assumption it has made may no longer be true in light of new evidence it receives. The second work I wil l discuss revisits partial-input baselines\, which have highlighted issues of spurious correlations in natural language reasoning datasets and led t o unfavorable assumptions about models’ reasoning abilities. In particular \, I will discuss experiments that show models may still learn to reason i n the presence of spurious dataset artifacts. Finally\, I will touch on wo rk analyzing harmful assumptions made by reasoning models in the form of s ocial stereotypes\, particularly in the case of free-form generative reaso ning models.\nBiography\nRachel Rudinger is an Assistant Professor in the Department of Computer Science at the University of Maryland\, College Par k. She holds joint appointments in the Department of Linguistics and the I nstitute for Advanced Computer Studies (UMIACS). In 2019\, Rachel complete d her Ph.D. in Computer Science at Johns Hopkins University in the Center for Language and Speech Processing. 
From 2019-2020\, she was a Young Inves tigator at the Allen Institute for AI in Seattle\, and a visiting research er at the University of Washington. Her research interests include computa tional semantics\, common-sense reasoning\, and issues of social bias and fairness in NLP. DTSTART;TZID=America/New_York:20220916T120000 DTEND;TZID=America/New_York:20220916T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Rachel Rudinger (University of Maryland\, College Park) “Not So Fas t!: Revisiting Assumptions in (and about) Natural Language Reasoning” URL:https://www.clsp.jhu.edu/events/rachel-rudinger-university-of-maryland- college-park-not-so-fast-revisiting-assumptions-in-and-about-natural-langu age-reasoning/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\nIn recent years\, the field of Natural Language Processing has seen a profusion of tasks\, datasets\, and systems that facilitate reasoning about real-world situations through language (e.g.\, RTE\, MNLI\, COMET). Such systems might\, for example\, be trained to consider a situation where “somebody dropped a glass on the floor\,” and conclude it is likely that “the glass shattered” as a result. In this talk\, I will discuss three pieces of work that revisit assumptions made by or about these systems. In the first work\, I develop a Defeasible Inference task\, which enables a system to recognize when a prior assumption it has made may no longer be true in light of new evidence it receives. The second work I will discuss revisits partial-input baselines\, which have highlighted issues of spurious correlations in natural language reasoning datasets and led to unfavorable assumptions about models’ reasoning abilities. In particular\, I will discuss experiments that show models may still learn to reason in the presence of spurious dataset artifacts. Finally\, I will touch on work analyzing harmful assumptions made by reasoning models in the form of social stereotypes\, particularly in the case of free-form generative reasoning models.
\nBiography
\nRachel Rudinger is an Assistant Professor in the Department of Computer Science at the University of Maryland\, College Park. She holds joint appointments in the Department of Linguistics and the Institute for Advanced Computer Studies (UMIACS). In 2019\, Rachel completed her Ph.D. in Computer Science at Johns Hopkins University in the Center for Language and Speech Processing. From 2019-2020\, she was a Young Investigator at the Allen Institute for AI in Seattle\, and a visiting researcher at the University of Washington. Her research interests include computational semantics\, common-sense reasoning\, and issues of social bias and fairness in NLP.
\n X-TAGS;LANGUAGE=en-US:2022\,Rudinger\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-22375@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nI will present our work on data augmentation using st yle transfer as a way to improve domain adaptation in sequence labeling ta sks. The target domain is social media data\, and the task is named entity recognition (NER). The premise is that we can transform the labelled out of domain data into something that stylistically is more closely related t o the target data. Then we can train a model on a combination of the gener ated data and the smaller amount of in domain data to improve NER predicti on performance. I will show recent empirical results on these efforts.\nIf time allows\, I will also give an overview of other research projects I’m currently leading at RiTUAL (Research in Text Understanding and Analysis of Language) lab. The common thread among all these research problems is t he scarcity of labeled data.\nBiography\nThamar Solorio is a Professor of Computer Science at the University of Houston (UH). She holds graduate deg rees in Computer Science from the Instituto Nacional de Astrofísica\, Ópti ca y Electrónica\, in Puebla\, Mexico. Her research interests include info rmation extraction from social media data\, enabling technology for code-s witched data\, stylistic modeling of text\, and more recently multimodal a pproaches for online content understanding. She is the director and founde r of the RiTUAL Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution\, and recipient of the 2014 Emerging L eader ABIE Award in Honor of Denice Denton. She is currently serving a sec ond term as an elected board member of the North American Chapter of the A ssociation of Computational Linguistics and was PC co-chair for NAACL 2019 . She recently joined the team of Editors in Chief for the ACL Rolling Rev iew (ARR) system. 
Her research is currently funded by the NSF and by ADOBE . DTSTART;TZID=America/New_York:20220923T120000 DTEND;TZID=America/New_York:20220923T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Thamar Solorio (University of Houston) “Style Transfer for Data Aug mentation in Sequence Labeling Tasks” URL:https://www.clsp.jhu.edu/events/thamar-solorio-university-of-houston-st yle-transfer-for-data-augmentation-in-sequence-labeling-tasks/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\nI will present our work on data augmentation using style transfer as a way to improve domain adaptation in sequence labeling tasks. The target domain is social media data\, and the task is named entity recognition (NER). The premise is that we can transform the labeled out-of-domain data into something that is stylistically closer to the target data. Then we can train a model on a combination of the generated data and the smaller amount of in-domain data to improve NER prediction performance. I will show recent empirical results on these efforts.
\nIf time allows\, I will also give an overview of other research projects I’m currently leading at the RiTUAL (Research in Text Understanding and Analysis of Language) lab. The common thread among all these research problems is the scarcity of labeled data.
\nBiography
\nThamar Solorio is a Professor of Computer Science at the University of Houston (UH). She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica\, Óptica y Electrónica\, in Puebla\, Mexico. Her research interests include information extraction from social media data\, enabling technology for code-switched data\, stylistic modeling of text\, and more recently multimodal approaches for online content understanding. She is the director and founder of the RiTUAL Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution\, and recipient of the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is currently serving a second term as an elected board member of the North American Chapter of the Association for Computational Linguistics and was PC co-chair for NAACL 2019. She recently joined the team of Editors in Chief for the ACL Rolling Review (ARR) system. Her research is currently funded by the NSF and by Adobe.
\n X-TAGS;LANGUAGE=en-US:2022\,September\,Solorio END:VEVENT BEGIN:VEVENT UID:ai1ec-22380@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nThe availability of large multilingual pre-trained la nguage models has opened up exciting pathways for developing NLP technolog ies for languages with scarce resources. In this talk I will advocate for the need to go beyond the most common languages in multilingual evaluation \, and on the challenges of handling new\, unseen-during-training language s and varieties. I will also share some of my experiences with working wit h indigenous and other endangered language communities and activists.\nBio graphy\n\nAntonios Anastasopoulos is an Assistant Professor in Computer Sc ience at George Mason University. In 2019\, Antonis received his PhD in Co mputer Science from the University of Notre Dame and then worked as a post doctoral researcher at the Language Technologies Institute at Carnegie Mel lon University. His research interests revolve around computational lingui stics and natural language processing with a focus on low-resource setting s\, endangered languages\, and cross-lingual learning.\n\n\n DTSTART;TZID=America/New_York:20220930T120000 DTEND;TZID=America/New_York:20220930T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Antonios Anastasopoulos (George Mason University) “NLP Beyond the T op-100 Languages” URL:https://www.clsp.jhu.edu/events/antonis-anastasopoulos-george-mason-uni versity/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\nThe availability of large multilingual pre-trained language models has opened up exciting pathways for developing NLP technologies for languages with scarce resources. In this talk I will advocate for the need to go beyond the most common languages in multilingual evaluation\, and discuss the challenges of handling new\, unseen-during-training languages and varieties. I will also share some of my experiences with working with indigenous and other endangered language communities and activists.
\nBiography
\nAntonios Anastasopoulos is an Assistant Professor in Computer Science at George Mason University. In 2019\, Antonis received his PhD in Computer Science from the University of Notre Dame and then worked as a postdoctoral researcher at the Language Technologies Institute at Carnegie Mellon University. His research interests revolve around computational linguistics and natural language processing with a focus on low-resource settings\, endangered languages\, and cross-lingual learning.
\n\n X-TAGS;LANGUAGE=en-US:2022\,Anastasopoulos\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-22423@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20221007T120000 DTEND;TZID=America/New_York:20221007T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Ariya Rastrow (Amazon) URL:https://www.clsp.jhu.edu/events/ariya-rastrow-amazon-2/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,October\,Rastrow END:VEVENT BEGIN:VEVENT UID:ai1ec-22394@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\n\nModel robustness and spurious correlations have rec eived increasing attention in the NLP community\, both in methods and eval uation. The term “spurious correlation” is overloaded though and can refer to any undesirable shortcuts learned by the model\, as judged by domain e xperts.\n\n\nWhen designing mitigation algorithms\, we often (implicitly) assume that a spurious feature is irrelevant for prediction. However\, man y features in NLP (e.g. word overlap and negation) are not spurious in the sense that the background is spurious for classifying objects in an image . In contrast\, they carry important information that’s needed to make pre dictions by humans. In this talk\, we argue that it is more productive to characterize features in terms of their necessity and sufficiency for pred iction. We then discuss the implications of this categorization in represe ntation\, learning\, and evaluation.\nBiography\nHe He is an Assistant Pro fessor in the Department of Computer Science and the Center for Data Scien ce at New York University. She obtained her PhD in Computer Science at the University of Maryland\, College Park. Before joining NYU\, she spent a y ear at AWS AI and was a post-doc at Stanford University before that. 
She i s interested in building robust and trustworthy NLP systems in human-cente red settings. Her recent research focus includes robust language understan ding\, collaborative text generation\, and understanding capabilities and issues of large language models. DTSTART;TZID=America/New_York:20221014T120000 DTEND;TZID=America/New_York:20221014T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:He He (New York University) “What We Talk about When We Talk about Spurious Correlations in NLP” URL:https://www.clsp.jhu.edu/events/he-he-new-york-university/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\n
Abstract
\nModel robustness and spurious correlations have received increasing attention in the NLP community\, both in methods and evaluation. The term “spurious correlation” is overloaded though and can refer to any undesirable shortcuts learned by the model\, as judged by domain experts.
\nWhen designing mitigation algorithms\, we often (implicitly) assume that a spurious feature is irrelevant for prediction. However\, many features in NLP (e.g. word overlap and negation) are not spurious in the sense that the background is spurious for classifying objects in an image. In contrast\, they carry important information that’s needed to make predictions by humans. In this talk\, we argue that it is more productive to characterize features in terms of their necessity and sufficiency for prediction. We then discuss the implications of this categorization in representation\, learning\, and evaluation.
\nBiography
\nHe He is an Assistant Professor in the Department of Computer Science and the Center for Data Science at New York University. She obtained her PhD in Computer Science at the University of Maryland\, College Park. Before joining NYU\, she spent a year at AWS AI and was a post-doc at Stanford University before that. She is interested in building robust and trustworthy NLP systems in human-centered settings. Her recent research focus includes robust language understanding\, collaborative text generation\, and understanding capabilities and issues of large language models.
\nAbstract
\nModern learning architectures for natural language processing have been very successful in incorporating a huge amount of text into their parameters. However\, by and large\, such models store and use knowledge in distributed and decentralized ways. This proves unreliable and makes the models ill-suited for knowledge-intensive tasks that require reasoning over factual information in linguistic expressions. In this talk\, I will give a few examples of exploring alternative architectures to tackle those challenges. In particular\, we can improve the performance of such (language) models by representing\, storing and accessing knowledge in a dedicated memory component.
\nThis talk is based on several joint works with Yury Zemlyanskiy (Google Research)\, Michiel de Jong (USC and Google Research)\, William Cohen (Google Research and CMU) and our other collaborators in Google Research.\n
Biography
\nFei is a research scientist at Google Research. Before that\, he was a Professor of Computer Science at the University of Southern California. His primary research interests are machine learning and its application to various AI problems: speech and language processing\, computer vision\, robotics and\, recently\, weather forecasting and climate modeling. He has a PhD (2007) in Computer and Information Science from the University of Pennsylvania and a B.Sc and M.Sc in Biomedical Engineering from Southeast University (Nanjing\, China).
\n X-TAGS;LANGUAGE=en-US:2022\,October\,Sha END:VEVENT BEGIN:VEVENT UID:ai1ec-22403@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nVoice conversion (VC) is a significant aspect of arti ficial intelligence. It is the study of how to convert one’s voice to soun d like that of another without changing the linguistic content. Voice conv ersion belongs to a general technical field of speech synthesis\, which co nverts text to speech or changes the properties of speech\, for example\, voice identity\, emotion\, and accents. Voice conversion involves multiple speech processing techniques\, such as speech analysis\, spectral convers ion\, prosody conversion\, speaker characterization\, and vocoding. With t he recent advances in theory and practice\, we are now able to produce hum an-like voice quality with high speaker similarity. In this talk\, Dr. Sis man will present the recent advances in voice conversion and discuss their promise and limitations. Dr. Sisman will also provide a summary of the av ailable resources for expressive voice conversion research.\nBiography\nDr . Berrak Sisman (Member\, IEEE) received the Ph.D. degree in electrical an d computer engineering from National University of Singapore in 2020\, ful ly funded by A*STAR Graduate Academy under Singapore International Graduat e Award (SINGA). She is currently working as a tenure-track Assistant Prof essor at the Erik Jonsson School Department of Electrical and Computer Eng ineering at University of Texas at Dallas\, United States. Prior to joinin g UT Dallas\, she was a faculty member at Singapore University of Technolo gy and Design (2020-2022). She was a Postdoctoral Research Fellow at the N ational University of Singapore (2019-2020). She was an exchange doctoral student at the University of Edinburgh and a visiting scholar at The Centr e for Speech Technology Research (CSTR)\, University of Edinburgh (2019). 
She was a visiting researcher at RIKEN Advanced Intelligence Project in Ja pan (2018). Her research is focused on machine learning\, signal processin g\, emotion\, speech synthesis and voice conversion.\nDr. Sisman has serve d as the Area Chair at INTERSPEECH 2021\, INTERSPEECH 2022\, IEEE SLT 2022 and as the Publication Chair at ICASSP 2022. She has been elected as a me mber of the IEEE Speech and Language Processing Technical Committee (SLTC) in the area of Speech Synthesis for the term from January 2022 to Decembe r 2024. She plays leadership roles in conference organizations and active in technical committees. She has served as the General Coordinator of the Student Advisory Committee (SAC) of International Speech Communication Ass ociation (ISCA). DTSTART;TZID=America/New_York:20221104T120000 DTEND;TZID=America/New_York:20221104T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Berrak Sisman (University of Texas at Dallas) “Speech Synthesis and Voice Conversion: Machine Learning can Mimic Anyone’s Voice” URL:https://www.clsp.jhu.edu/events/berrak-sisman-university-of-texas-at-da llas/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstr act
\nVoice conversion (VC) is a significant aspect of artificial intelligence. It is the study of how to convert one’s voice to sound like that of another without changing the linguistic content. Voice conversion belongs to a general technical field of speech synthesis\, which converts text to speech or changes the properties of speech\, for example\, voice identity\, emotion\, and accents. Voice conversion involves multiple speech processing techniques\, such as speech analysis\, spectral conversion\, prosody conversion\, speaker characterization\, and vocoding. With the recent advances in theory and practice\, we are now able to produce human-like voice quality with high speaker similarity. In this talk\, Dr. Sisman will present the recent advances in voice conversion and discuss their promise and limitations. Dr. Sisman will also provide a summary of the available resources for expressive voice conversion research.
\nDr. Berrak Sisman (Member\, IEEE) received the Ph.D. degree in electrical and computer engineering from National University of Singapore in 2020\, fully funded by A*STAR Graduate Academy under Singapore International Graduate Award (SINGA). She is currently working as a tenure-track Assistant Professor at the Erik Jonsson School Department of Electrical and Computer Engineering at University of Texas at Dallas\, United States. Prior to joining UT Dallas\, she was a faculty member at Singapore University of Technology and Design (2020-2022). She was a Postdoctoral Research Fellow at the National University of Singapore (2019-2020). She was an exchange doctoral student at the University of Edinburgh and a visiting scholar at The Centre for Speech Technology Research (CSTR)\, University of Edinburgh (2019). She was a visiting researcher at RIKEN Advanced Intelligence Project in Japan (2018). Her research is focused on machine learning\, signal processing\, emotion\, speech synthesis and voice conversion.
\nDr. Sisman has served as an Area Chair at INTERSPEECH 2021\, INTERSPEECH 2022\, and IEEE SLT 2022\, and as the Publication Chair at ICASSP 2022. She has been elected as a member of the IEEE Speech and Language Processing Technical Committee (SLTC) in the area of Speech Synthesis for the term from January 2022 to December 2024. She plays leadership roles in conference organizations and is active in technical committees. She has served as the General Coordinator of the Student Advisory Committee (SAC) of the International Speech Communication Association (ISCA).
\n X-TAGS;LANGUAGE=en-US:2022\,November\,Sisman END:VEVENT BEGIN:VEVENT UID:ai1ec-22408@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nAI-powered applications increasingly adopt Deep Neural Networks (DNNs) for solving many prediction tasks\, leading to more than one DNN running on resource-constrained devices. Supporting many models simultaneously on a device is challenging due to the linearly increased computation\, energy\, and storage costs. An effective approach to address the problem is multi-task learning (MTL) where a set of tasks are learned jointly to allow some parameter sharing among tasks. MTL creates multi-task models based on common DNN architectures and has shown significantly reduced inference costs and improved generalization performance in many machine learning applications. In this talk\, we will introduce our recent efforts on leveraging MTL to improve accuracy and efficiency for edge computing. The talk will introduce multi-task architecture design systems that can automatically identify resource-efficient multi-task models with low inference costs and high task accuracy.\n\nBiography\n\n\nHui Guan is an Assistant Professor in the College of Information and Computer Sciences (CICS) at the University of Massachusetts Amherst\, the flagship campus of the UMass system. She received her Ph.D. in Electrical Engineering from North Carolina State University in 2020. Her research lies in the intersection between machine learning and systems\, with an emphasis on improving the speed\, scalability\, and reliability of machine learning through innovations in algorithms and programming systems. Her current research focuses on both algorithm and system optimizations of deep multi-task learning and graph machine learning. DTSTART;TZID=America/New_York:20221111T120000 DTEND;TZID=America/New_York:20221111T131500 LOCATION:Hackerman Hall B17 @ 3400 N. 
Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Hui Guan (University of Massachusetts Amherst) “Towards Accurate and Efficient Edge Computing Via Multi-Task Learning” URL:https://www.clsp.jhu.edu/events/hui-guan-university-of-massachusetts-amherst/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstract
\nAbstract
\nDriven by the goal of eradicating language barriers on a global scale\, machine translation has solidified itself as a key focus of artificial intelligence research today. However\, such efforts have coalesced around a small subset of languages\, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200 language barrier while ensuring safe\, high-quality results\, all while keeping ethical considerations in mind? In this talk\, I introduce No Language Left Behind\, an initiative to break language barriers for low-resource languages. In No Language Left Behind\, we took on the low-resource language translation challenge by first contextualizing the need for translation support through exploratory interviews with native speakers. Then\, we created datasets and models aimed at narrowing the performance gap between low and high-resource languages. We proposed multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically\, we evaluated the performance of over 40\,000 different translation directions using a human-translated benchmark\, Flores-200\, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art\, laying important groundwork towards realizing a universal translation system in an open-source manner.
\nBiography
\nAngela is a research scientist at Meta AI Research in New York\, focusing on supporting efforts in speech and language research. Recent projects include No Language Left Behind (https://ai.facebook.com/research/no-language-left-behind/) and Universal Speech Translation for Unwritten Languages (https://ai.facebook.com/blog/ai-translation-hokkien/). Before translation\, Angela previously focused on research in on-device models for NLP and computer vision and text generation.
\n\n X-TAGS;LANGUAGE=en-US:2022\,Fan\,November END:VEVENT BEGIN:VEVENT UID:ai1ec-22417@www.clsp.jhu.edu DTSTAMP:20240329T125237Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nOne of the keys to success in machine learning applications is to improve each user’s personal experience via personalized models. A personalized model can be a more resource-efficient solution than a general-purpose model\, too\, because it focuses on a particular sub-problem\, for which a smaller model architecture can be good enough. However\, training a personalized model requires data from the particular test-time user\, which are not always available due to their private nature and technical challenges. Furthermore\, such data tend to be unlabeled as they can be collected only during the test time\, once after the system is deployed to user devices. One could rely on the generalization power of a generic model\, but such a model can be too computationally/spatially complex for real-time processing in a resource-constrained device. In this talk\, I will present some techniques to circumvent the lack of labeled personal data in the context of speech enhancement. Our machine learning models will require zero or few data samples from the test-time users\, while they can still achieve the personalization goal. To this end\, we will investigate modularized speech enhancement models as well as the potential of self-supervised learning for personalized speech enhancement. Because our research achieves the personalization goal in a data- and resource-efficient way\, it is a step towards a more available and affordable AI for society.\nBiography\nMinje Kim is an associate professor in the Dept. of Intelligent Systems Engineering at Indiana University\, where he leads his research group\, Signals and AI Group in Engineering (SAIGE). He is also an Amazon Visiting Academic\, consulting for Amazon Lab126. 
At IU\, he is affiliated with various programs and labs such as Data Science\, Cognitive Science\, Dept. of Statistics\, and Center for Machine Learning. He earned his Ph.D. in the Dept. of Computer Science at the University of Illinois at Urbana-Champaign. Before joining UIUC\, he worked as a researcher at ETRI\, a national lab in Korea\, from 2006 to 2011. Before then\, he received his Master’s and Bachelor’s degrees in the Dept. of Computer Science and Engineering at POSTECH (Summa Cum Laude) and in the Division of Information and Computer Engineering at Ajou University (with honor) in 2006 and 2004\, respectively. He is a recipient of various awards including NSF Career Award (2021)\, IU Trustees Teaching Award (2021)\, IEEE SPS Best Paper Award (2020)\, and Google and Starkey’s grants for outstanding student papers in ICASSP 2013 and 2014\, respectively. He is an IEEE Senior Member and also a member of the IEEE Audio and Acoustic Signal Processing Technical Committee (2018-2023). He is serving as an Associate Editor for EURASIP Journal of Audio\, Speech\, and Music Processing\, and as a Consulting Associate Editor for IEEE Open Journal of Signal Processing. He is also a reviewer\, program committee member\, or area chair for the major machine learning and signal processing venues. He filed more than 50 patent applications as an inventor. DTSTART;TZID=America/New_York:20221202T120000 DTEND;TZID=America/New_York:20221202T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Minje Kim (Indiana University) “Personalized Speech Enhancement: Data- and Resource-Efficient Machine Learning” URL:https://www.clsp.jhu.edu/events/minje-kim-indiana-university/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n
\\nAbstract
\nOne of the keys to success in machine learning applications is to improve each user’s personal experience via personalized models. A personalized model can be a more resource-efficient solution than a general-purpose model\, too\, because it focuses on a particular sub-problem\, for which a smaller model architecture can be good enough. However\, training a personalized model requires data from the particular test-time user\, which are not always available due to their private nature and technical challenges. Furthermore\, such data tend to be unlabeled as they can be collected only during the test time\, once after the system is deployed to user devices. One could rely on the generalization power of a generic model\, but such a model can be too computationally/spatially complex for real-time processing in a resource-constrained device. In this talk\, I will present some techniques to circumvent the lack of labeled personal data in the context of speech enhancement. Our machine learning models will require zero or few data samples from the test-time users\, while they can still achieve the personalization goal. To this end\, we will investigate modularized speech enhancement models as well as the potential of self-supervised learning for personalized speech enhancement. Because our research achieves the personalization goal in a data- and resource-efficient way\, it is a step towards a more available and affordable AI for society.
\nBiography
\nAbstract
\nZipf’s law is commonly glossed by the aphorism “infrequent words are frequent\,” but in practice\, it has often meant that there are three types of words: frequent\, infrequent\, and out-of-vocabulary (OOV). Speech recognition solved the problem of frequent words in 1970 (with dynamic time warping). Hidden Markov models worked well for moderately infrequent words\, but the problem of OOV words was not solved until sequence-to-sequence neural nets de-reified the concept of a word. Many other social phenomena follow power-law distributions. The number of native speakers of the N’th most spoken language\, for example\, is 1.44 billion over N to the 1.09. In languages with sufficient data\, we have shown that monolingual pre-training outperforms multilingual pre-training. In less-frequent languages\, multilingual knowledge transfer can significantly reduce phone error rates. In languages with no training data\, unsupervised ASR methods can be proven to converge\, as long as the eigenvalues of the language model are sufficiently well separated to be measurable. Other systems of social categorization may follow similar power-law distributions. Disability\, for example\, can cause speech patterns that were never seen in the training database\, but not all disabilities need do so. The inability of speech technology to work for people with even common disabilities is probably caused by a lack of data\, and can probably be solved by finding better modes of interaction between technology researchers and the communities served by technology.
\nBiography
\nMark Hasegawa-Johnson is a William L. Everitt Faculty Fellow of Electrical and Computer Engineering at the University of Illinois in Urbana-Champaign. He has published research in speech production and perception\, source separation\, voice conversion\, and low-resource automatic speech recognition.
\n X-TAGS;LANGUAGE=en-US:2022\,December\,Hasegawa-Johnson END:VEVENT END:VCALENDAR