BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-20117@www.clsp.jhu.edu DTSTAMP:20240329T005526Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nNeural sequence generation systems oftentimes generate sequences by searching for the most likely sequence under the learnt probability distribution. This assumes that the most likely sequence\, i.e. the mode\, under such a model must also be the best sequence it has to offer (often in a given context\, e.g. conditioned on a source sentence in translation). Recent findings in neural machine translation (NMT) show that the true most likely sequence oftentimes is empty under many state-of-the-art NMT models. This follows a large list of other pathologies and biases observed in NMT and other sequence generation models: a length bias\, larger beams degrading performance\, exposure bias\, and many more. Many of these works blame the probabilistic formulation of NMT or maximum likelihood estimation. We provide a different view on this: it is mode-seeking search\, e.g. beam search\, that introduces many of these pathologies and biases\, and such a decision rule is not suitable for the type of distributions learnt by NMT systems. We show that NMT models spread probability mass over many translations\, and that the most likely translation oftentimes is a rare event. We further show that translation distributions do capture important aspects of translation well in expectation. Therefore\, we advocate for decision rules that take into account the entire probability distribution and not just its mode. We provide one example of such a decision rule\, and show that this is a fruitful research direction.
\nBiography
\nI am an assistant professor (UD) in natural language processing at the Institute for Logic\, Language and Computation where I lead the Probabilistic Language Learning group.
\nMy work concerns the design of models and algorithms that learn to represent\, understand\, and generate language data. Examples of specific problems I am interested in include language modelling\, machine translation\, syntactic parsing\, textual entailment\, text classification\, and question answering.
\nI also develop techniques to approach general machine learning problems such as probabilistic inference\, gradient and density estimation.
\nMy interests sit at the intersection of disciplines such as statistics\, machine learning\, approximate inference\, global optimization\, formal languages\, and computational linguistics.
\n\n
DTSTART;TZID=America/New_York:20210419T120000 DTEND;TZID=America/New_York:20210419T131500 LOCATION:via Zoom SEQUENCE:0 SUMMARY:Wilker Aziz (University of Amsterdam) “The Inadequacy of the Mode in Neural Machine Translation” URL:https://www.clsp.jhu.edu/events/wilker-aziz-university-of-amsterdam/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,April\,Aziz END:VEVENT BEGIN:VEVENT UID:ai1ec-21259@www.clsp.jhu.edu DTSTAMP:20240329T005526Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nNatural language processing has been revolutionized by neural networks\, which perform impressively well in applications such as machine translation and question answering. Despite their success\, neural networks still have some substantial shortcomings: Their internal workings are poorly understood\, and they are notoriously brittle\, failing on example types that are rare in their training data. In this talk\, I will use the unifying thread of hierarchical syntactic structure to discuss approaches for addressing these shortcomings. First\, I will argue for a new evaluation paradigm based on targeted\, hypothesis-driven tests that better illuminate what models have learned\; using this paradigm\, I will show that even state-of-the-art models sometimes fail to recognize the hierarchical structure of language (e.g.\, to conclude that “The book on the table is blue” implies “The table is blue.”) Second\, I will show how these behavioral failings can be explained through analysis of models’ inductive biases and internal representations\, focusing on the puzzle of how neural networks represent discrete symbolic structure in continuous vector space. I will close by showing how insights from these analyses can be used to make models more robust through approaches based on meta-learning\, structured architectures\, and data augmentation.
\nBiography
\nTom McCoy is a PhD candidate in the Department of Cognitive Science at Johns Hopkins University. As an undergraduate\, he studied computational linguistics at Yale. His research combines natural language processing\, cognitive science\, and machine learning to study how we can achieve robust generalization in models of language\, as this remains one of the main areas where current AI systems fall short. In particular\, he focuses on inductive biases and representations of linguistic structure\, since these are two of the major components that determine how learners generalize to novel types of input.
DTSTART;TZID=America/New_York:20220131T120000 DTEND;TZID=America/New_York:20220131T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Tom McCoy (Johns Hopkins University) “Opening the Black Box of Deep Learning: Representations\, Inductive Biases\, and Robustness” URL:https://www.clsp.jhu.edu/events/tom-mccoy-johns-hopkins-university-opening-the-black-box-of-deep-learning-representations-inductive-biases-and-robustness/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,January\,McCoy END:VEVENT BEGIN:VEVENT UID:ai1ec-23302@www.clsp.jhu.edu DTSTAMP:20240329T005526Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20230130T120000 DTEND;TZID=America/New_York:20230130T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Daniel Fried (CMU) URL:https://www.clsp.jhu.edu/events/daniel-fried-cmu/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Fried\,January END:VEVENT BEGIN:VEVENT UID:ai1ec-24239@www.clsp.jhu.edu DTSTAMP:20240329T005526Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nNon-invasive neural interfaces have the potential to transform human-computer interaction by providing users with low friction\, information rich\, always available inputs. Reality Labs at Meta is developing such an interface for the control of augmented reality devices based on electromyographic (EMG) signals captured at the wrist. Speech and audio technologies turn out to be especially well suited to unlocking the full potential of these signals and interactions and this talk will present several specific problems and the speech and audio approaches that have advanced us towards this ultimate goal of effortless and joyful interfaces. We will provide the necessary neuroscientific background to understand these signals\, describe automatic speech recognition-inspired interfaces generating text and beamforming-inspired interfaces for identifying individual neurons\, and then explain how they connect with egocentric machine intelligence tasks that might reside on these devices.
\nBiography
\nMichael I Mandel is a Research Scientist in Reality Labs at Meta. Previously\, he was an Associate Professor of Computer and Information Science at Brooklyn College and the CUNY Graduate Center working at the intersection of machine learning\, signal processing\, and psychoacoustics. He earned his BSc in Computer Science from the Massachusetts Institute of Technology and his MS and PhD with distinction in Electrical Engineering from Columbia University as a Fu Foundation Presidential Scholar. He was an FQRNT Postdoctoral Research Fellow in the Machine Learning laboratory (LISA/MILA) at the Université de Montréal\, an Algorithm Developer at Audience Inc\, and a Research Scientist in Computer Science and Engineering at the Ohio State University. His work has been supported by the National Science Foundation\, including via a CAREER award\, the Alfred P. Sloan Foundation\, and Google\, Inc.
DTSTART;TZID=America/New_York:20240129T120000 DTEND;TZID=America/New_York:20240129T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Michael I Mandel (Meta) “Speech and Audio Processing in Non-Invasive Brain-Computer Interfaces at Meta” URL:https://www.clsp.jhu.edu/events/michael-i-mandel-cuny/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,January\,Mandel END:VEVENT END:VCALENDAR