BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-21267@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nIn this talk\, I present a multipronged strategy for zero-shot cross-lingual Information Extraction\, that is\, the construction of an IE model for some target language\, given existing annotations exclusively in some other language. This work is part of the JHU team’s effort under the IARPA BETTER program. I explore data augmentation techniques including data projection and self-training\, and how different pretrained encoders impact them. We find through extensive experiments and extension of techniques that a combination of approaches\, both new and old\, leads to better performance than any one cross-lingual strategy in particular.
\nBiography
\nAbstract
\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have questioned the robustness of longitudinal analyses based on statistical methods due to issues of temporal bias and semantic shift. To what extent are changes in semantics over time affecting the reliability of longitudinal analyses? We examine this question through a case study: understanding shifts in mental health during the course of the COVID-19 pandemic. We demonstrate that a recently-introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and improve predictive generalization over time. Ultimately\, we find that these analyses are critical to producing accurate longitudinal studies of social media.
DTSTART;TZID=America/New_York:20220207T120000 DTEND;TZID=America/New_York:20220207T131500 LOCATION:In Person or Virtual Option @ https://wse.zoom.us/j/96735183473 @ 234 Ames Hall\, 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Keith Harrigian “The Problem of Semantic Shift in Longitudinal Monitoring of Social Media: A Case Study on Mental Health during the COVID-19 Pandemic” URL:https://www.clsp.jhu.edu/events/student-seminar-keith-harrigian-the-problem-of-semantic-shift-in-longitudinal-monitoring-of-social-media-a-case-study-on-mental-health-during-the-covid-19-pandemic/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Harrigian END:VEVENT BEGIN:VEVENT UID:ai1ec-21277@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nAs humans\, our understanding of language is grounded in a rich mental model about “how the world works” – that we learn through perception and interaction. We use this understanding to reason beyond what we literally observe or read\, imagining how situations might unfold in the world. Machines today struggle at this kind of reasoning\, which limits how they can communicate with humans.
\nIn my talk\, I will discuss three lines of work to bridge this gap between machines and humans. I will first discuss how we might measure grounded understanding. I will introduce a suite of approaches for constructing benchmarks\, using machines in the loop to filter out spurious biases. Next\, I will introduce PIGLeT: a model that learns physical commonsense understanding by interacting with the world through simulation\, using this knowledge to ground language. From an English-language description of an event\, PIGLeT can anticipate how the world state might change – outperforming text-only models that are orders of magnitude larger. Finally\, I will introduce MERLOT\, which learns about situations in the world by watching millions of YouTube videos with transcribed speech. Through training objectives inspired by the developmental psychology idea of multimodal reentry\, MERLOT learns to fuse language\, vision\, and sound together into powerful representations. Together\, these directions suggest a path forward for building machines that learn language rooted in the world.
\nBiography
\nRowan Zellers is a final year PhD candidate at the University of Washington in Computer Science & Engineering\, advised by Yejin Choi and Ali Farhadi. His research focuses on enabling machines to understand language\, vision\, sound\, and the world beyond these modalities. He has been recognized through an NSF Graduate Fellowship and a NeurIPS 2021 outstanding paper award. His work has appeared in several media outlets\, including Wired\, the Washington Post\, and the New York Times. In the past\, he graduated from Harvey Mudd College with a B.S. in Computer Science & Mathematics\, and has interned at the Allen Institute for AI.
DTSTART;TZID=America/New_York:20220214T120000 DTEND;TZID=America/New_York:20220214T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Rowan Zellers (University of Washington) “Grounding Language by Seeing\, Hearing\, and Interacting” URL:https://www.clsp.jhu.edu/events/rowan-zellers-university-of-washington-grounding-language-by-seeing-hearing-and-interacting/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Zellers END:VEVENT BEGIN:VEVENT UID:ai1ec-21280@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nAs AI-driven language interfaces (such as chat-bots) become more integrated into our lives\, they need to become more versatile and reliable in their communication with human users. How can we make progress toward building more “general” models that are capable of understanding a broader spectrum of language commands\, given practical constraints such as the limited availability of labeled data?
\nIn this talk\, I will describe my research toward addressing this question along two dimensions of generality. First\, I will discuss progress in “breadth” — models that address a wider variety of tasks and abilities\, drawing inspiration from existing statistical learning techniques such as multi-task learning. In particular\, I will showcase a system that works well on several QA benchmarks\, resulting in state-of-the-art results on 10 benchmarks. Furthermore\, I will show its extension to tasks beyond QA (such as text generation or classification) that can be “defined” via natural language. In the second part\, I will focus on progress in “depth” — models that can handle complex inputs such as compositional questions. I will introduce Text Modular Networks\, a general framework that casts problem-solving as natural language communication among simpler “modules.” Applying this framework to compositional questions by leveraging discrete optimization and existing non-compositional closed-box QA models results in a model with strong empirical performance on multiple complex QA benchmarks while providing human-readable reasoning.
\nI will conclude with future research directions toward broader NLP systems by addressing the limitations of the presented ideas and other missing elements needed to move toward more general-purpose interactive language understanding systems.
\nBiography
\nDaniel Khashabi is a postdoctoral researcher at the Allen Institute for Artificial Intelligence (AI2)\, Seattle. Previously\, he completed his Ph.D. in Computer and Information Sciences at the University of Pennsylvania in 2019. His interests lie at the intersection of artificial intelligence and natural language processing\, with a vision toward more general systems through unified algorithms and theories.
DTSTART;TZID=America/New_York:20220218T120000 DTEND;TZID=America/New_York:20220218T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Daniel Khashabi (Allen Institute for Artificial Intelligence) “The Quest Toward Generality in Natural Language Understanding” URL:https://www.clsp.jhu.edu/events/daniel-khashabi-allen-institute-for-artificial-intelligence/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Khashabi END:VEVENT BEGIN:VEVENT UID:ai1ec-21487@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nSince it is increasingly hard to opt out from interacting with AI technology\, people demand that AI is capable of maintaining contracts such that it supports agency and oversight of people who are required to use it or who are affected by it. To help those people create a mental model about how to interact with AI systems\, I extend the underlying models to self-explain—predict the label/answer and explain this prediction. In this talk\, I will present how to generate (1) free-text explanations given in plain English that immediately tell users the gist of the reasoning\, and (2) contrastive explanations that help users understand how they could change the text to get another label.
\nBiography
\nAna Marasović is a postdoctoral researcher at the Allen Institute for AI (AI2) and the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research interests broadly lie in the fields of natural language processing\, explainable AI\, and vision-and-language learning. Her projects are motivated by a unified goal: improve interaction with and control of NLP systems to help people make these systems do what they want with the confidence that they’re getting exactly what they need. Prior to joining AI2\, Ana obtained her PhD from Heidelberg University.
\nHow to pronounce my name: the first name is Ana like in Spanish\, i.e.\, with a long “a” like in “water”\; regarding the last name: “mara” as in actress Mara Wilson + “so” + “veetch”.
DTSTART;TZID=America/New_York:20220228T120000 DTEND;TZID=America/New_York:20220228T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Ana Marasović (Allen Institute for AI & University of Washington) “Self-Explaining for Intuitive Interaction with AI” URL:https://www.clsp.jhu.edu/events/ana-marasovic-allen-institute-for-ai-university-of-washington-self-explaining-for-intuitive-interaction-with-ai/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Marasovic END:VEVENT BEGIN:VEVENT UID:ai1ec-23304@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nTransformers are essential to pretraining. As we approach 5 years of BERT\, the connection between attention as architecture and transfer learning remains key to this central thread in NLP. Other architectures such as CNNs and RNNs have been used to replicate pretraining results\, but these either fail to reach the same accuracy or require supplemental attention layers. This work revisits the seminal BERT result and considers pretraining without attention. We consider replacing self-attention layers with recently developed approaches for long-range sequence modeling and transformer architecture variants. Specifically\, inspired by recent papers like the structured state space sequence model (S4)\, we use simple routing layers based on state-space models (SSM) and a bidirectional model architecture based on multiplicative gating. We discuss the results of the proposed Bidirectional Gated SSM (BiGS) and present a range of analyses into its properties. Results show that architecture does seem to have a notable impact on downstream performance and a different inductive bias that is worth exploring further.
\nBiography
\nAbstract
\nWhile large language models have advanced the state-of-the-art in natural language processing\, these models are trained on large-scale datasets\, which may include harmful information. Studies have shown that as a result\, the models exhibit social biases and generate misinformation after training. In this talk\, I will discuss my work on analyzing and interpreting the risks of large language models across the areas of fairness\, trustworthiness\, and safety. I will first describe my research in the detection of dialect bias between African American English (AAE) and Standard American English (SAE). The second part investigates the trustworthiness of models through the memorization and subsequent generation of conspiracy theories. I will end my talk with recent work in AI safety regarding text that may lead to physical harm.
\nBiography
\nSharon is a 5th-year Ph.D. candidate at the University of California\, Santa Barbara\, where she is advised by Professor William Wang. Her research interests lie in natural language processing\, with a focus on Responsible AI. Sharon’s research spans the subareas of fairness\, trustworthiness\, and safety\, with publications in ACL\, EMNLP\, WWW\, and LREC. She has spent summers interning at AWS\, Meta\, and Pinterest. Sharon is a 2022 EECS Rising Star and a current recipient of the Amazon Alexa AI Fellowship for Responsible AI.
DTSTART;TZID=America/New_York:20230206T120000 DTEND;TZID=America/New_York:20230206T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Sharon Levy (University of California\, Santa Barbara) “Responsible AI via Responsible Large Language Models” URL:https://www.clsp.jhu.edu/events/sharon-levy-university-of-california-santa-barbara-responsible-ai-via-responsible-large-language-models/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,February\,Levy END:VEVENT BEGIN:VEVENT UID:ai1ec-23308@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nBiases in datasets\, or unintentionally introduced spurious cues\, are a common source of misspecification in machine learning. Performant models trained on such data can exhibit gender stereotypes or be brittle under distribution shift. In this talk\, we present several results in multimodal and question answering applications studying sources of dataset bias\, and several mitigation methods. We propose approaches where known dimensions of dataset bias are explicitly factored out of a model during learning\, without needing to modify data. Finally\, we ask whether dataset biases can be attributable to annotator behavior during annotation. Drawing inspiration from work in psychology on cognitive biases\, we show certain behavioral patterns are highly indicative of the creation of problematic (but valid) data instances in question answering. We give evidence that many existing observations around how dataset bias propagates to models can be attributed to data samples created by annotators we identify.
\nBiography
\nMark Yatskar is an Assistant Professor at the University of Pennsylvania in the department of Computer and Information Science. He did his PhD at the University of Washington\, co-advised by Luke Zettlemoyer and Ali Farhadi. He was a Young Investigator at the Allen Institute for Artificial Intelligence for several years working with their computer vision team\, Prior. His work spans Natural Language Processing\, Computer Vision\, and Fairness in Machine Learning. He received a Best Paper Award at EMNLP for work on gender bias amplification\, and his work has been featured in Wired and the New York Times.
\nDTSTART;TZID=America/New_York:20230210T120000 DTEND;TZID=America/New_York:20230210T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Mark Yatskar (University of Pennsylvania) “Understanding Dataset Biases: Behavioral Indicators During Annotation and Contrastive Mitigations” URL:https://www.clsp.jhu.edu/events/mark-yatskar-university-of-pennsylvania/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,February\,Yatskar END:VEVENT BEGIN:VEVENT UID:ai1ec-23314@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nWhile GPT models have shown impressive performance on summarization and open-ended text generation\, it’s important to assess their abilities on more constrained text generation tasks that require significant and diverse rewritings. In this talk\, I will discuss the challenges of evaluating systems that are highly competitive and perform close to humans on two such tasks: (i) paraphrase generation and (ii) text simplification. To address these challenges\, we introduce an interactive Rank-and-Rate evaluation framework. Our results show that GPT-3.5 has made a major step up from fine-tuned T5 in paraphrase generation\, but still lacks the diversity and creativity of humans who spontaneously produce large quantities of paraphrases.
\nAdditionally\, we demonstrate that GPT-3.5 performs similarly to a single human in text simplification\, which makes it difficult for existing automatic evaluation metrics to distinguish between the two. To overcome this shortcoming\, we propose LENS\, a learnable evaluation metric that outperforms SARI\, BERTScore\, and other existing methods in both automatic evaluation and minimal risk decoding for text generation.
\nBiography
\nWei Xu is an assistant professor in the School of Interactive Computing at the Georgia Institute of Technology\, where she is also affiliated with the new NSF AI CARING Institute and Machine Learning Center. She received her Ph.D. in Computer Science from New York University and her B.S. and M.S. from Tsinghua University. Xu’s research interests are in natural language processing\, machine learning\, and social media\, with a focus on text generation\, stylistics\, robustness and controllability of machine learning models\, and reading and writing assistive technology. She is a recipient of the NSF CAREER Award\, CrowdFlower AI for Everyone Award\, Criteo Faculty Research Award\, and Best Paper Award at COLING’18. She has also received funds from DARPA and IARPA. She is an elected member of the NAACL executive board and regularly serves as a senior area chair for AI/NLP conferences.
DTSTART;TZID=America/New_York:20230224T120000 DTEND;TZID=America/New_York:20230224T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Wei Xu (Georgia Tech) “GPT-3 vs Humans: Rethinking Evaluation of Natural Language Generation” URL:https://www.clsp.jhu.edu/events/wei-xu-georgia-tech/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,February\,Xu END:VEVENT BEGIN:VEVENT UID:ai1ec-23316@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nUnderstanding the implications underlying a text is critical to assessing its impact\, in particular the social dynamics that may result from a reading of the text. This requires endowing artificial intelligence (AI) systems with pragmatic reasoning\, for example to correctly conclude that the statement “Epidemics and cases of disease in the 21st century are “staged”” relates to unfounded conspiracy theories. In this talk\, I discuss how shortcomings in the ability of current AI systems to reason about pragmatics present challenges to equitable detection of false or harmful language. I demonstrate how these shortcomings can be addressed by imposing human-interpretable structure on deep learning architectures using insights from linguistics.
\nIn the first part of the talk\, I describe how adversarial text generation algorithms can be used to improve robustness of content moderation systems. I then introduce a pragmatic formalism for reasoning about harmful implications conveyed by social media text. I show how this pragmatic approach can be combined with generative neural language models to uncover implications of news headlines. I also address the bottleneck to progress in text generation posed by gaps in evaluation of factuality. I conclude by showing how context-aware content moderation can be used to ensure safe interactions with conversational agents.
\nBiography
\nSaadia Gabriel is a PhD candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington\, advised by Prof. Yejin Choi and Prof. Franziska Roesner. Her research revolves around natural language processing and machine learning\, with a particular focus on building systems for understanding how social commonsense manifests in text (i.e. how do people typically behave in social scenarios)\, as well as mitigating spread of false or harmful text (e.g. Covid-19 misinformation). Her work has been covered by a wide range of media outlets like Forbes and TechCrunch. It has also received a 2019 ACL best short paper nomination\, a 2019 IROS RoboCup best paper nomination and won a best paper award at the 2020 WeCNLP summit. Prior to her PhD\, Saadia received a BA summa cum laude from Mount Holyoke College in Computer Science and Mathematics.
\nDTSTART;TZID=America/New_York:20230227T120000 DTEND;TZID=America/New_York:20230227T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Saadia Gabriel (University of Washington) “Socially Responsible and Factual Reasoning for Equitable AI Systems” URL:https://www.clsp.jhu.edu/events/saadia-gabriel-university-of-washington-socially-responsible-and-factual-reasoning-for-equitable-ai-systems/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,February\,Gabriel END:VEVENT BEGIN:VEVENT UID:ai1ec-23312@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nAdvanced neural language models have grown ever larger and more complex\, pushing forward the limits of language understanding and generation\, while diminishing interpretability. The black-box nature of deep neural networks blocks humans from understanding them\, as well as trusting and using them in real-world applications. This talk will introduce interpretation techniques that bridge the gap between humans and models for developing trustworthy natural language processing (NLP). I will first show how to explain black-box models and evaluate their explanations for understanding their prediction behavior. Then I will introduce how to improve the interpretability of neural language models by making their decision-making transparent and rationalized. Finally\, I will discuss how to diagnose and improve models (e.g.\, robustness) through the lens of explanations. I will conclude with future research directions that are centered around model interpretability and committed to facilitating communications and interactions between intelligent machines\, system developers\, and end users for long-term trustworthy AI.
\nBiography
\nHanjie Chen is a Ph.D. candidate in Computer Science at the University of Virginia\, advised by Prof. Yangfeng Ji. Her research interests lie in Trustworthy AI\, Natural Language Processing (NLP)\, and Interpretable Machine Learning. She develops interpretation techniques to explain neural language models and make their prediction behavior transparent and reliable. She is a recipient of the Carlos and Esther Farrar Fellowship and the Best Poster Award at the ACM CAPWIC 2021. Her work has been published at top-tier NLP/AI conferences (e.g.\, ACL\, AAAI\, EMNLP\, NAACL) and selected by the National Center for Women & Information Technology (NCWIT) Collegiate Award Finalist 2021. She (as the primary instructor) co-designed and taught the course\, Interpretable Machine Learning\, and was awarded the UVA CS Outstanding Graduate Teaching Award and University-wide Graduate Teaching Awards Nominee (top 5% of graduate instructors). More details can be found at https://www.cs.virginia.edu/~hc9mx
DTSTART;TZID=America/New_York:20230313T120000 DTEND;TZID=America/New_York:20230313T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Hanjie Chen (University of Virginia) “Bridging Humans and Machines: Techniques for Trustworthy NLP” URL:https://www.clsp.jhu.edu/events/hanjie-chen-university-of-virginia/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Chen\,February END:VEVENT BEGIN:VEVENT UID:ai1ec-24241@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nOur research focuses on improving speech processing algorithms\, such as automatic speech recognition (ASR)\, speaker identification\, and depression detection\, under challenging conditions such as limited data (for example\, children’s or clinical speech)\, mismatched conditions (for example\, training on read speech while recognizing conversational speech)\, and noisy speech\, using a hybrid data-driven and knowledge-based approach. This approach requires understanding of both machine learning approaches and of the human speech production and perception systems. I will summarize in this talk our work on children’s ASR using self-supervised models\, detecting depression from speech signals using novel speaker disentanglement techniques\, and automating scoring of children’s reading tasks with both ASR and innovative NLP algorithms.
\nBiography
\nAbeer Alwan received her Ph.D. in Electrical Engineering and Computer Science from MIT in 1992. Since then\, she has been with the ECE department at UCLA where she is now a Full Professor and directs the Speech Processing and Auditory Perception Laboratory. She is the recipient of the NSF Research Initiation and Career Awards\, NIH FIRST Award\, UCLA-TRW Excellence in Teaching Award\, Okawa Foundation Award in Telecommunication\, and the Engineer’s Council Educator Award. She is a Fellow of the Acoustical Society of America\, IEEE\, and International Speech Communication Assoc. (ISCA). She was a Fellow at the Radcliffe Institute\, Harvard University\, co-Editor in Chief of Speech Communication\, Associate Editor of JASA and IEEE TSALP\, a Distinguished Lecturer of ISCA\, a member of the IEEE Signal Processing Board of Governors\, and she is currently on the advisory board of ISCA and the UCLA-Amazon Science Hub for Humanity and AI.
DTSTART;TZID=America/New_York:20240202T120000 DTEND;TZID=America/New_York:20240202T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Abeer Alwan (UCLA) “Dealing with Limited Speech Data and Variability: Three case studies” URL:https://www.clsp.jhu.edu/events/abeer-alwan-ucla/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,Alwan\,February END:VEVENT BEGIN:VEVENT UID:ai1ec-24425@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\n
Over the past three decades\, the fields of automatic speech recognition (ASR) and machine translation (MT) have witnessed remarkable advancements\, leading to exciting research directions such as speech-to-text translation (ST). This talk will delve into the domain of conversational ST\, an essential facet of daily communication\, which presents unique challenges including spontaneous informal language\, the presence of disfluencies\, high context dependence and a scarcity of ST paired data.
\nAbstract
\nThere is an enormous data gap between how AI systems and children learn language: The best LLMs now learn language from text with a word count in the trillions\, whereas it would take a child roughly 100K years to reach those numbers through speech (Frank\, 2023\, “Bridging the data gap”). There is also a clear generalization gap: whereas machines struggle with systematic generalization\, people excel. For instance\, once a child learns how to “skip\,” they immediately know how to “skip twice” or “skip around the room with their hands up” due to their compositional skills. In this talk\, I’ll describe two case studies in addressing these gaps:
\n1) The data gap: We train deep neural networks from scratch (using DINO\, CLIP\, etc.)\, not on large-scale data from the web\, but through the eyes and ears of a single child. Using head-mounted video recordings from a child (61 hours of video slices over 19 months)\, we show how deep neural networks can acquire many word-referent mappings\, generalize to novel visual referents\, and achieve multi-modal alignment. Our results demonstrate how today’s AI models are capable of learning key aspects of children’s early knowledge from realistic input.
\n2) The generalization gap: Can neural networks capture human-like systematic generalization? We address a 35-year-old debate catalyzed by Fodor and Pylyshyn’s classic article\, which argued that standard neural networks are not viable models of the mind because they lack systematic compositionality — the algebraic ability to understand and produce novel combinations from known components. We’ll show how neural networks can achieve human-like systematic generalization when trained through meta-learning for compositionality (MLC)\, a new method for optimizing the compositional skills of neural networks through practice. With MLC\, a neural network can match human performance and solve several machine learning benchmarks.
\nGiven this work\, we’ll discuss the paths forward for building machines that learn\, generalize\, and interact in more human-like ways based on more natural input.
\nRelated articles:
\nVong\, W. K.\, Wang\, W.\, Orhan\, A. E.\, and Lake\, B. M. (2024). Grounded language acquisition through the eyes and ears of a single child. Science\, 383.
\nOrhan\, A. E.\, and Lake\, B. M. (in press). Learning high-level visual representations from a child’s perspective without strong inductive biases. Nature Machine Intelligence.
\nLake\, B. M. and Baroni\, M. (2023). Human-like systematic generalization through a meta-learning neural network. Nature\, 623\, 115-121.
\nBiography
\nBrenden M. Lake is an Assistant Professor of Psychology and Data Science at New York University. He received his M.S. and B.S. in Symbolic Systems from Stanford University in 2009\, and his Ph.D. in Cognitive Science from MIT in 2014. He was a postdoctoral Data Science Fellow at NYU from 2014-2017. Brenden is a recipient of the Robert J. Glushko Prize for Outstanding Doctoral Dissertation in Cognitive Science\, he is an MIT Technology Review Innovator Under 35\, and his research was selected by Scientific American as one of the 10 most important advances of 2016. Brenden’s research focuses on computational problems that are easier for people than they are for machines\, such as learning new concepts\, creating new concepts\, learning-to-learn\, and asking questions.
\nAbstract
\nLarge language models like ChatGPT have shown extraordinary abilities for writing. While impressive at first glance\, large language models aren’t perfect and often make mistakes humans would not make. The main architecture behind ChatGPT mostly doesn’t differ from early neural networks\, and as a consequence\, carries some of the same limitations. My work revolves around the use of neural networks like ChatGPT mixed with symbolic methods from early AI and how these two families of methods can combine to create more robust AI. I talk about some of the neurosymbolic methods I used for applications in story generation and understanding — with the goal of eventually creating AI that can play Dungeons & Dragons. I also discuss pain points that I found for improving accessible communication and show how large language models can supplement such communication.
\nBiography
\nAbstract
\nWe introduce STAR (Stream Transduction with Anchor Representations)\, a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams. STAR dynamically segments input streams to create compressed anchor representations\, achieving nearly lossless compression (12x) in Automatic Speech Recognition (ASR) and outperforming existing methods. Moreover\, STAR demonstrates superior segmentation and latency-quality trade-offs in simultaneous speech-to-text tasks\, optimizing latency\, memory footprint\, and quality.
DTSTART;TZID=America/New_York:20240219T120000 DTEND;TZID=America/New_York:20240219T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Steven Tan “Streaming Sequence Transduction through Dynamic Compression” URL:https://www.clsp.jhu.edu/events/steven-tan-streaming-sequence-transduction-through-dynamic-compression/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,February\,Tan END:VEVENT BEGIN:VEVENT UID:ai1ec-24429@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nI discuss the application of Foundation Models in Astronomy through the collaborative efforts of the UniverseTBD consortium with a mission to democratize Science for everyone. One of our key objectives is to overcome the limitations of general-purpose Foundation Models\, such as producing limited information in specialized fields. To this end\, we have developed the first specialized large language model for Astronomy\, AstroLLaMa-1. This model\, enhanced by exposure to domain-specific literature from the NASA Astrophysics Data System and ArXiv\, demonstrates improved text completion and embedding capabilities over existent GPT models. I further discuss the potential of LLMs in generating complex scientific hypotheses and extracting meaningful insights from astronomy literature. Our findings\, validated by human experts\, demonstrate the LLM capability in informed scientific critique and uncover intriguing patterns in the embedding space\, highlighting the potential of LLMs to augment scientific inquiry. I will also discuss preliminary work with the multi-modal model AstroLLaVA\, which allows us to interact with astronomical images via natural language. Through the work of UniverseTBD\, we aim to explore how artificial intelligence can assist human intelligence in Astronomy and\, more broadly\, Science.
\nBiography
\nIoana Ciucă\, who goes by Jo\, is an interdisciplinary Jubilee Joint Fellow at the Australian National University\, working across the School of Computing and the Research School of Astronomy & Astrophysics. Before joining ANU\, Jo finished her PhD in Astrophysics at University College London in the United Kingdom\, where she worked at the intersection of Astronomy and Machine Learning to understand the formation and evolution history of our Galaxy\, the Milky Way. Jo is now focusing on utilizing foundation models that benefit researchers everywhere\, working alongside the UniverseTBD team of more than 30 astronomers\, engineers\, ML practitioners and enthusiasts worldwide.
\nDTSTART;TZID=America/New_York:20240223T120000 DTEND;TZID=America/New_York:20240223T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Ioana Ciuca (Australian National University) “A Universe To Be Decided: Towards Specialized Foundation Models for Advancing Astronomy” URL:https://www.clsp.jhu.edu/events/ioana-ciuca-australian-national-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,Ciuca\,February END:VEVENT BEGIN:VEVENT UID:ai1ec-24457@www.clsp.jhu.edu DTSTAMP:20240329T112407Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:
Abstract
\nAs artificial intelligence (AI) continues to rapidly expand into existing healthcare infrastructure – e.g.\, clinical decision support\, administrative tasks\, and public health surveillance – it is perhaps more important than ever to reflect on the broader purpose of such systems. While much focus has been on the potential for this technology to improve general health outcomes\, there also exists a significant\, but understated\, opportunity to use this technology to address health-related disparities. Accomplishing the latter depends not only on our ability to effectively identify addressable areas of systemic inequality and translate them into tasks that are machine learnable\, but also our ability to measure\, interpret\, and counteract barriers in training data that may inhibit robustness to distribution shift upon deployment (i.e.\, new populations\, temporal dynamics). In this talk\, we will discuss progress made along both of these dimensions. We will begin by providing background on the state of AI for promoting health equity. Then\, we will present results from a recent clinical phenotyping project and discuss their implication on prevailing views regarding language model robustness in clinical applications. Finally\, we will showcase ongoing efforts to proactively address systemic inequality in healthcare by identifying and characterizing stigmatizing language in medical records.
DTSTART;TZID=America/New_York:20240226T120000 DTEND;TZID=America/New_York:20240226T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Keith Harrigian (JHU) “Fighting Bias From Bias: Robust Natural Language Processing Techniques to Promote Health Equity” URL:https://www.clsp.jhu.edu/events/keith-harrigian-jhu-fighting-bias-from-bias-robust-natural-language-processing-techniques-to-promote-health-equity/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,February\,Harrigian END:VEVENT END:VCALENDAR