BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-20117@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nNeural sequence generation systems oftentimes generate sequences by searching for the most likely sequence under the learnt probability distribution. This assumes that the most likely sequence\, i.e. the mode\, under such a model must also be the best sequence it has to offer (often in a given context\, e.g. conditioned on a source sentence in translation). Recent findings in neural machine translation (NMT) show that the true most likely sequence oftentimes is empty under many state-of-the-art NMT models. This follows a large list of other pathologies and biases observed in NMT and other sequence generation models: a length bias\, larger beams degrading performance\, exposure bias\, and many more. Many of these works blame the probabilistic formulation of NMT or maximum likelihood estimation. We provide a different view on this: it is mode-seeking search\, e.g. beam search\, that introduces many of these pathologies and biases\, and such a decision rule is not suitable for the type of distributions learnt by NMT systems. We show that NMT models spread probability mass over many translations\, and that the most likely translation oftentimes is a rare event. We further show that translation distributions do capture important aspects of translation well in expectation. Therefore\, we advocate for decision rules that take into account the entire probability distribution and not just its mode. We provide one example of such a decision rule\, and show that this is a fruitful research direction.
\nBiography
\nI am an assistant professor (UD) in natural language processing at the Institute for Logic\, Language and Computation where I lead the Probabilistic Language Learning group.
\nMy work concerns the design of models and algorithms that learn to represent\, understand\, and generate language data. Examples of specific problems I am interested in include language modelling\, machine translation\, syntactic parsing\, textual entailment\, text classification\, and question answering.
\nI also develop techniques to approach general machine learning problems such as probabilistic inference\, gradient and density estimation.
\nMy interests sit at the intersection of disciplines such as statistics\, machine learning\, approximate inference\, global optimization\, formal languages\, and computational linguistics.
\n\n
DTSTART;TZID=America/New_York:20210419T120000 DTEND;TZID=America/New_York:20210419T131500 LOCATION:via Zoom SEQUENCE:0 SUMMARY:Wilker Aziz (University of Amsterdam) “The Inadequacy of the Mode in Neural Machine Translation” URL:https://www.clsp.jhu.edu/events/wilker-aziz-university-of-amsterdam/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,April\,Aziz END:VEVENT BEGIN:VEVENT UID:ai1ec-20120@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nRobotics@Google’s mission is to make robots useful in the real world through machine learning. We are excited about a new model for robotics\, designed for generalization across diverse environments and instructions. This model is focused on scalable data-driven learning\, which is task-agnostic\, leverages simulation\, learns from past experience\, and can be quickly adapted to work in the real world through limited interactions. In this talk\, we’ll share some of our recent work in this direction in both manipulation and locomotion applications.
\nBiography
\nCarolina
Abstract
\nOver the last few years\, deep neural models have taken over the field of natural language processing (NLP)\, brandishing great improvements on many of its sequence-level tasks. But the end-to-end nature of these models makes it hard to figure out whether the way they represent individual words aligns with how language builds itself from the bottom up\, or how lexical changes in register and domain can affect the untested aspects of such representations.
\nIn this talk\, I will present NYTWIT\, a dataset created to challenge large language models at the lexical level\, tasking them with identification of processes leading to the formation of novel English words\, as well as with segmentation and recovery of the specific subclass of novel blends. I will then present XRayEmb\, a method which alleviates the hardships of processing these novelties by fitting a character-level encoder to the existing models’ subword tokenizers\; and conclude with a discussion of the drawbacks of current tokenizers’ vocabulary creation schemes.
\nBiography
\nYuval Pinter is a Senior Lecturer in the Department of Computer Science at Ben-Gurion University of the Negev\, focusing on natural language processing. Yuval got his PhD at the Georgia Institute of Technology School of Interactive Computing as a Bloomberg Data Science PhD Fellow. Before that\, he worked as a Research Engineer at Yahoo Labs and as a Computational Linguist at Ginger Software\, and obtained an MA in Linguistics and a BSc in CS and Mathematics\, both from Tel Aviv University. Yuval blogs (in Hebrew) about language matters on Dagesh Kal.
DTSTART;TZID=America/New_York:20210910T120000 DTEND;TZID=America/New_York:20210910T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD SEQUENCE:0 SUMMARY:Yuval Pinter (Ben-Gurion University – Virtual Visit) “Challenging and Adapting NLP Models to Lexical Phenomena” URL:https://www.clsp.jhu.edu/events/yuval-pinter/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,Pinter\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-20723@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nText simplification aims to help audiences read and understand a piece of text through lexical\, syntactic\, and discourse modifications\, while remaining faithful to its central idea and meaning. Thanks to large-scale parallel corpora derived from Wikipedia and News\, much of modern-day text simplification research focuses on sentence simplification\, transforming original\, more complex sentences into simplified versions. In this talk\, I present new frontiers that focus on discourse operations. First\, we consider the challenging task of simplifying highly technical language\, in our case\, medical texts. We introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric to quantify stylistic differences between the two\, and models for paragraph-level simplification. Second\, we present the first data-driven study of inserting elaborations and explanations during simplification\, and illustrate the richness and complexities of this phenomenon.
\nBiography
\nAbstract
\nRaytheon BBN participated in the IARPA MATERIAL program\, whose objective is to enable rapid development of language-independent methods for cross-lingual information retrieval (CLIR). The challenging CLIR task of retrieving documents written (or spoken) in one language so that they satisfy an information need expressed in a different language is exacerbated by unique challenges posed by the MATERIAL program: limited training data for automatic speech recognition and machine translation\, scant lexical resources\, non-standardized orthography\, etc. Furthermore\, the format of the queries and the “Query-Weighted Value” performance measure are non-standard and not previously studied in the IR community. In this talk\, we will describe the Raytheon BBN CLIR system\, which was successful at addressing the above challenges and unique characteristics of the program.
\nBiography
\nDamianos Karakos has been at Raytheon BBN for the past nine years\, where he is currently a Senior Principal Engineer\, Research. Before that\, he was research faculty at Johns Hopkins University. He has worked on several Government projects (e.g.\, DARPA GALE\, DARPA RATS\, IARPA BABEL\, IARPA MATERIAL\, IARPA BETTER) and on a variety of HLT-related topics (e.g.\, speech recognition\, speech activity detection\, keyword search\, information retrieval). He has published more than 60 peer-reviewed papers. His research interests lie at the intersection of human language technology and machine learning\, with an emphasis on statistical methods. He obtained a PhD in Electrical Engineering from the University of Maryland\, College Park\, in 2002.
\n\n
Abstract
\nWhile there is a vast amount of text written about nearly any topic\, it is often difficult for someone unfamiliar with a specific field to understand. Automated text simplification aims to reduce the complexity of a document\, making it more comprehensible to a broader audience. Much of the research in this field has traditionally focused on simplification sub-tasks\, such as lexical\, syntactic\, or sentence-level simplification. However\, current systems struggle to consistently produce high-quality simplifications. Phrase-based models tend to make too many poor transformations\; on the other hand\, recent neural models\, while producing grammatical output\, often do not make all needed changes to the original text. In this thesis\, I discuss novel approaches for improving lexical and sentence-level simplification systems. Regarding sentence simplification models\, after noting that encouraging diversity at inference time leads to significant improvements\, I take a closer look at the idea of diversity and perform an exhaustive comparison of diverse decoding techniques on other generation tasks. I also discuss the limitations in the framing of current simplification tasks\, which prevent these models from yet being practically useful. Thus\, I also propose a retrieval-based reformulation of the problem. Specifically\, starting with a document\, I identify concepts critical to understanding its content\, and then retrieve documents relevant for each concept\, re-ranking them based on the desired complexity level.
\nBiography
\nI’m a research scientist at the HLTCOE at Johns Hopkins University. My primary research interests are in language generation\, diverse and constrained decoding\, and information retrieval. During my PhD I focused mainly on the task of text simplification\, and now am working on formulating structured prediction problems as end-to-end generation tasks. I received my PhD in July 2021 from the University of Pennsylvania with Chris Callison-Burch and Marianna Apidianaki.
\nDTSTART;TZID=America/New_York:20211022T120000 DTEND;TZID=America/New_York:20211022T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Reno Kriz (HLTCOE – JHU) “Towards a Practically Useful Text Simplification System” URL:https://www.clsp.jhu.edu/events/reno-kriz-hltcoe-jhu-towards-a-practically-useful-text-simplification-system/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,Kriz\,October END:VEVENT BEGIN:VEVENT UID:ai1ec-21023@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nSpeech data is notoriously difficult to work with due to a variety of codecs\, lengths of recordings\, and meta-data formats. We present Lhotse\, a speech data representation library that draws upon lessons learned from the Kaldi speech recognition toolkit and brings its concepts into the modern deep learning ecosystem. Lhotse provides a common JSON description format with corresponding Python classes and data preparation recipes for over 30 popular speech corpora. Various datasets can be easily combined together and re-purposed for different tasks. The library handles multi-channel recordings\, long recordings\, local and cloud storage\, and lazy and on-the-fly operations\, amongst other features. We introduce Cut and CutSet concepts\, which simplify common data wrangling tasks for audio and help incorporate acoustic context of speech utterances. Finally\, we show how Lhotse leverages PyTorch data API abstractions and adopts them to handle speech data for deep learning.
\nBiography
\nPiotr Zelasko is an assistant research scientist in the Center for Language and Speech Processing (CLSP) who specializes in automatic speech recognition (ASR) and spoken language understanding (SLU). His current research focuses on applying multilingual and crosslingual speech recognition systems to categorize the phonetic inventory of a previously unknown language and on improving defenses against adversarial attacks on both speaker identification and automatic speech recognition systems. He is also addressing the question of how to structure a spontaneous conversation into high-level semantic units such as dialog acts or topics. Finally\, he is working on Lhotse + K2\, the next-generation speech processing research software ecosystem. Before joining Johns Hopkins\, Zelasko worked as a machine learning consultant for Avaya (2017-2019)\, and as a machine learning engineer for Techmo (2015-2017). Zelasko received his PhD (2019) in electronics engineering\, as well as his master’s (2014) and undergraduate degrees (2013) in acoustic engineering from AGH University of Science and Technology in Kraków\, Poland.
DTSTART;TZID=America/New_York:20211029T120000 DTEND;TZID=America/New_York:20211029T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore MD 21218 SEQUENCE:0 SUMMARY:Piotr Zelasko (CLSP at JHU) “Lhotse: a speech data representation library for the modern deep learning ecosystem” URL:https://www.clsp.jhu.edu/events/piotr-zelasko-clsp-at-jhu-lhotse-a-speech-data-representation-library-for-the-modern-deep-learning-ecosystem/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,October\,Zelasko END:VEVENT BEGIN:VEVENT UID:ai1ec-21031@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nMost people take for granted that when they speak\, they will be heard and understood. But for the millions who live with speech impairments caused by physical or neurological conditions\, trying to communicate with others can be difficult and lead to frustration. While there have been a great number of recent advances in Automatic Speech Recognition (ASR) technologies\, these interfaces can be inaccessible for those with speech impairments.
\nIn this talk\, we will present Parrotron\, an end-to-end-trained speech-to-speech conversion model that maps an input spectrogram directly to another spectrogram\, without utilizing any intermediate discrete representation. The system is also trained to emit words in addition to a spectrogram\, in parallel. We demonstrate that this model can be trained to normalize speech from any speaker regardless of accent\, prosody\, and background noise\, into the voice of a single canonical target speaker with a fixed accent and consistent articulation and prosody. We further show that this normalization model can be adapted to normalize highly atypical speech from speakers with a variety of speech impairments (due to ALS\, cerebral palsy\, deafness\, stroke\, brain injury\, etc.)\, resulting in significant improvements in intelligibility and naturalness\, measured via a speech recognizer and listening tests. Finally\, demonstrating the utility of this model on other speech tasks\, we show that the same model architecture can be trained to perform a speech separation task.\n
Dimitri will give a brief description of some key moments in the development of speech recognition algorithms that he was involved in and their applications to YouTube closed captions\, Live Transcribe\, and wearable subtitles.
\nFadi will then speak about the development of Parrotron.
\nBiographies
\nDimitri Kanevsky started his career at Google working on speech recognition algorithms. Prior to joining Google\, Dimitri was a Research Staff Member in the Speech Algorithms Department at IBM. Prior to IBM\, he worked at a number of centers for higher mathematics\, including the Max Planck Institute in Germany and the Institute for Advanced Study in Princeton. He currently holds 295 US patents and was a Master Inventor at IBM. MIT Technology Review recognized Dimitri’s conversational-biometrics-based security patent as one of the five most influential patents for 2003. In 2012\, Dimitri was honored at the White House as a Champion of Change for his efforts to advance access to science\, technology\, engineering\, and math.
\nFadi Biadsy has been a senior staff research scientist at Google NY for the past ten years. He has been exploring and leading multiple projects at Google\, including speech recognition\, speech conversion\, language modeling\, and semantic understanding. He received his PhD from Columbia University in 2011. At Columbia\, he researched a variety of speech and language processing projects including dialect and accent recognition\, speech recognition\, charismatic speech\, and question answering. He holds a BSc and MSc in mathematics and computer science. He worked on handwriting recognition during his master’s degree and worked as a senior software developer for five years at Dalet digital media systems building multimedia broadcasting systems.
DTSTART;TZID=America/New_York:20211105T120000 DTEND;TZID=America/New_York:20211105T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Fadi Biadsy and Dimitri Kanevsky (Google) “Speech Recognition: From Speaker Dependent to Speaker Independent to Full Personalization” “Parrotron: A Unified E2E Speech-to-Speech Conversion and ASR Model for Atypical Speech” URL:https://www.clsp.jhu.edu/events/fadi-biadsy-and-dimitri-kanevsky-google/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,Biadsy and Kanevsky\,November END:VEVENT BEGIN:VEVENT UID:ai1ec-21041@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nNarration is a universal human practice that serves as a key site of education\, collective memory\, fostering social belief systems\, and furthering human creativity. Recent studies in economics (Shiller\, 2020)\, climate science (Bushell et al.\, 2017)\, political polarization (Kubin et al.\, 2021)\, and mental health (Adler et al.\, 2016) suggest an emerging interdisciplinary consensus that narrative is a central concept for understanding human behavior and beliefs. For close to half a century\, the field of narratology has developed a rich set of theoretical frameworks for understanding narrative. And yet these theories have largely gone untested on large\, heterogeneous collections of texts. Scholars continue to generate schemas by extrapolating from small numbers of manually observed documents. In this talk\, I will discuss how we can use machine learning to develop data-driven theories of narration to better understand what Labov and Waletzky called “the simplest and most fundamental narrative structures.” How can machine learning help us approach what we might call a minimal theory of narrativity?
\nAndrew Piper is Professor and William Dawson Scholar in the Department of Languages\, Literatures\, and Cultures at McGill University. He is the director of .txtlab\, a laboratory for cultural analytics\, and editor of the /Journal of Cultural Analytics/\, an open-access journal dedicated to the computational study of culture. He is the author of numerous books and articles on the relationship of technology and reading\, including /Book Was There: Reading in Electronic Times/ (Chicago 2012)\, /Enumerations: Data and Literary Study/ (Chicago 2018)\, and most recently\, /Can We Be Wrong? The Problem of Textual Evidence in a Time of Data/ (Cambridge 2020).
DTSTART;TZID=America/New_York:20211112T120000 DTEND;TZID=America/New_York:20211112T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Andrew Piper (McGill University) “How can we use machine learning to understand narration?” URL:https://www.clsp.jhu.edu/events/andrew-piper-mcgill-university-how-can-we-use-machine-learning-to-understand-narration/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,November\,Piper END:VEVENT BEGIN:VEVENT UID:ai1ec-21057@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nThis talk will outline the major challenges in porting mainstream speech technology to the domain of clinical applications\; in particular\, the need for personalised systems\, the challenge of working in an inherently sparse data domain\, and developing meaningful collaborations with all stakeholders. The talk will give an overview of recent state-of-the-art research from current projects\, including in the areas of recognition of disordered speech\, automatic processing of conversations\, and the automatic detection and tracking of paralinguistic information at the University of Sheffield (UK)’s Speech and Hearing (SPandH) & Healthcare lab.
\nBiography
\nHeidi is a Senior Lecturer (associate professor) in Computer Science at the University of Sheffield\, United Kingdom. Her research interests are in the application of AI-based voice technologies to healthcare\; in particular\, the detection and monitoring of people’s physical and mental health\, including verbal and non-verbal traits for expressions of emotion\, anxiety\, depression\, and neurodegenerative conditions in e.g.\, therapeutic or diagnostic settings.
DTSTART;TZID=America/New_York:20211119T120000 DTEND;TZID=America/New_York:20211119T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Heidi Christensen (University of Sheffield\, UK) Virtual Seminar “Automated Processing of Pathological Speech: Recent Work and Ongoing Challenges” URL:https://www.clsp.jhu.edu/events/heidi-christensen-university-of-sheffield-uk-virtual-seminar-automated-processing-of-pathological-speech-recent-work-and-ongoing-challenges/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,Christensen\,November END:VEVENT BEGIN:VEVENT UID:ai1ec-21068@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20211203T120000 DTEND;TZID=America/New_York:20211203T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Eric Ringger (Zillow Group) URL:https://www.clsp.jhu.edu/events/eric-ringger-zillow-group/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,December\,Ringger END:VEVENT BEGIN:VEVENT UID:ai1ec-21072@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nHow important are different temporal speech modulations for speech recognition? We answer this question from two complementary perspectives. Firstly\, we quantify the amount of phonetic information in the modulation spectrum of speech by computing the mutual information between temporal modulations and frame-wise phoneme labels. Looking from another perspective\, we ask which speech modulations an Automatic Speech Recognition (ASR) system prefers for its operation. Data-driven weights are learned over the modulation spectrum and optimized for an end-to-end ASR task. Both methods unanimously agree that speech information is mostly contained in slow modulations. Maximum mutual information occurs around 3-6 Hz\, which also happens to be the range of modulations most preferred by the ASR. In addition\, we show that the incorporation of this knowledge into ASRs significantly reduces their dependency on the amount of training data.
\n\n
Learning How to Play With the Machines: Taking Stock of Where the Collaboration Between Computational and Social Science Stands
\nSpeakers: Jeff Gill\, Ernesto Calvo\, Hale Sirin\, and Antonios Anastasopoulos
DTSTART;TZID=America/New_York:20230407T120000 DTEND;TZID=America/New_York:20230407T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street SEQUENCE:0 SUMMARY:JHU CLSP APSA Roundtable on Learning How to Play with the Machines URL:https://www.clsp.jhu.edu/events/jhu-clsp-apsa-roundtable-on-learning-how-to-play-with-the-machines/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,April\,APSA Roundtable END:VEVENT BEGIN:VEVENT UID:ai1ec-23586@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20230410T120000 DTEND;TZID=America/New_York:20230410T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Ruizhe Huang URL:https://www.clsp.jhu.edu/events/student-seminar-ruizhe-huang/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,April\,Huang END:VEVENT BEGIN:VEVENT UID:ai1ec-23588@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nAdvances in open domain Large Language Models (LLMs) starting with BERT and more recently with GPT-4\, PaLM\, and LLaMA have facilitated dramatic improvements in conversational systems. These improvements include an unprecedented breadth of conversational interactions between humans and machines while maintaining and sometimes surpassing the accuracy of systems trained specifically for known\, closed domains. However\, many applications still require higher levels of accuracy than pre-trained LLMs can provide. There are many studies underway to accomplish this. Broadly speaking\, the methods assume the pre-trained models are fixed (due to cost/time)\, and instead look to various augmentation methods including prompting strategies and model adaptation/fine-tuning.
\nOne augmentation strategy leverages the context of the conversation. For example\, who are the participants and what is known about these individuals (personal context)\, what was just said (dialogue context)\, where is the conversation taking place (geo context)\, what time of day and season is it (time context)\, etc. A powerful form of context is the shared visual setting of the conversation between the human(s) and machine. The shared visual scene may be from a device (phone\, smart glasses) or represented on a screen (browser\, maps\, etc.). The elements in the visual context can be exploited by grounding the natural language conversational interaction\, thereby changing the priors of certain concepts and increasing the accuracy of the system. In this talk\, I will present some of my historical work in this area as well as my recent work in the AI Virtual Assistant (AVA) Lab at Georgia Tech.
\nBio
\nDr. Larry Heck is a Professor with a joint appointment in the School of Electrical and Computer Engineering and the School of Interactive Computing at the Georgia Institute of Technology. He holds the Rhesa S. Farmer Distinguished Chair of Advanced Computing Concepts and is a Georgia Research Alliance Eminent Scholar. He received the BSEE from Texas Tech University (1986)\, and MSEE and PhD EE from the Georgia Institute of Technology (1989\, 1991). He is a Fellow of the IEEE\, was inducted into the Academy of Distinguished Engineering Alumni at Georgia Tech\, and received the Distinguished Engineer Award from the Texas Tech University Whitacre College of Engineering. He was a Senior Research Engineer with SRI (1992-98)\, Vice President of R&D at Nuance (1998-2005)\, Vice President of Search and Advertising Sciences at Yahoo! (2005-2009)\, Chief Scientist of the Microsoft Speech products and Distinguished Engineer in Microsoft Research (2009-2014)\, Principal Scientist with Google Research (2014-2017)\, and CEO of Viv Labs and SVP at Samsung (2017-2021).
\n\n
Abstract
\nOur models achieve state-of-the-art performance and lay important groundwork towards realizing a universal translation system. At the same time\, we keep making open-source contributions for everyone to keep advancing the research for the languages they care about.
\nPaco is a Research Scientist Manager supporting translation teams at Meta AI (FAIR). He works in the field of machine translation with a focus on low-resource translation (e.g. NLLB\, FLORES) and the aim to break language barriers. He joined Meta in 2016. His research has been published in top-tier NLP venues like ACL and EMNLP. He was the co-chair of the Research track at AMTA (2020-2022). He has organized several research competitions focused on low-resource translation and data filtering. Paco obtained his PhD from the ITESM in Mexico\, was a visiting scholar at the LTI-CMU from 2008-2009\, and participated in DARPA’s GALE evaluation program. Paco was a post-doc and scientist at the Qatar Computing Research Institute in Qatar in 2012-2016.
DTSTART;TZID=America/New_York:20230417T120000 DTEND;TZID=America/New_York:20230417T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Paco Guzman (Meta AI) “Building a Universal Translation System to Break Down Language Barriers” URL:https://www.clsp.jhu.edu/events/paco-guzman-meta-ai/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,April\,Guzman END:VEVENT BEGIN:VEVENT UID:ai1ec-23592@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nLarge language models (LLMs) have ushered in exciting capabilities in language understanding and text generation\, with systems like ChatGPT holding fluent dialogs with users and being almost indistinguishable from humans. While this has obviously raised conversational systems and chatbots to a new level\, it also presents exciting new opportunities for building artificial agents with improved decision making capabilities. Specifically\, the ability to reason with language can allow us to build agents that can 1) execute complex action sequences to effect change in the world\, 2) learn new skills by ‘reading’ in addition to ‘doing’\, and 3) allow for easier personalization and control over their behavior. In this talk\, I will demonstrate how we can build such language-enabled agents that exhibit the above traits across various use cases such as multi-hop question answering\, web interaction\, and robotic tool manipulation. In the end\, I will also discuss some dangers of using these LLM-based systems and some challenges that lie ahead in ensuring their safe use.
\nBiography
\nKarthik Narasimhan is an assistant professor in the Computer Science department at Princeton University and a co-Director of the Princeton NLP group. His research spans the areas of natural language processing and reinforcement learning\, with the goal of building intelligent agents that learn to operate in the world through both their own experience (“doing things”) and leveraging existing human knowledge (“reading about things”). Karthik received his PhD from MIT in 2017\, and spent a year as a visiting research scientist at OpenAI contributing to the GPT language model\, prior to joining Princeton in 2018. His research has been recognized by the NSF CAREER\, a Google Research Scholar Award\, an Amazon research award (2019)\, a Bell Labs runner-up prize\, and outstanding paper awards at EMNLP (2015\, 2016) and NeurIPS (2022).
DTSTART;TZID=America/New_York:20230421T120000 DTEND;TZID=America/New_York:20230421T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Karthik Narasimhan (Princeton University) “Towards General-Purpose Language-Enabled Agents: Machines that can Read\, Think and Act” URL:https://www.clsp.jhu.edu/events/karthik-narasimhan-princeton-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,April\,Narasimhan END:VEVENT BEGIN:VEVENT UID:ai1ec-23606@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20230424T120000 DTEND;TZID=America/New_York:20230424T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Brian Lu URL:https://www.clsp.jhu.edu/events/student-seminar-brian-lu/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,April\,Lu END:VEVENT BEGIN:VEVENT UID:ai1ec-23608@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nAutomated analysis of student writing has the potential to provide alternatives to selected-response questions such as multiple choice\, and to enable teachers and instructors to assess students’ reasoning skills based on their long-form writing. Further\, automated support to assess both short answers and long passages could provide students with a smoother trajectory towards mastery of written communication. Our methods focus on the specific ideas students express to support formative assessment through different kinds of feedback\, which aims to scaffold their abilities to reason and communicate. In this talk I review our work in the PSU NLP lab on methods for automated assessment of different forms of student writing\, from younger and older students. I will briefly illustrate highly curated datasets created in collaboration with researchers in STEM education\, results from deployment of an older content analysis tool on middle school physics essays\, and very preliminary results on assessment of college students’ physics lab reports. I will also present our current work on short answer assessment using a novel recurrent relation network that incorporates contrastive learning.
\nBio
\nBecky Passonneau has been a Professor in the Department of Computer Science and Engineering at Penn State University since 2016\, when she joined as the first NLP researcher. Since that time the NLP faculty has grown to include Rui Zhang and Wenpeng Yin. Becky’s research in natural language processing addresses computational pragmatics\, meaning the investigation of language as a system of interactive behavior that serves a wide range of purposes. She received her PhD in Linguistics from the University of Chicago in 1985\, and worked at several academic and industry research labs before joining Penn State. Her work is reported in over 140 publications in journals and refereed conference proceedings\, and has been funded through 27 sponsored projects from 16 sources\, including government agencies\, corporate sponsors\, corporate gifts\, and foundations.
DTSTART;TZID=America/New_York:20230428T120000 DTEND;TZID=America/New_York:20230428T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Becky Passonneau (Penn State University) “Automated Support to Scaffold Students’ Short- and Long-form STEM Writing” URL:https://www.clsp.jhu.edu/events/becky-passonneau-penn-state-university-automated-support-to-scaffold-students-short-and-long-form-stem-writing/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,April\,Passonneau END:VEVENT BEGIN:VEVENT UID:ai1ec-23882@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nLarge language models (LLMs) have demonstrated incredible power\, but they also possess vulnerabilities that can lead to misuse and potential attacks. In this presentation\, we will address two fundamental questions regarding the responsible utilization of LLMs: (1) How can we accurately identify AI-generated text? (2) What measures can safeguard the intellectual property of LLMs? We will introduce two recent watermarking techniques designed for text and models\, respectively. Our discussion will encompass the theoretical underpinnings that ensure the correctness of watermark detection\, along with robustness against evasion attacks. Furthermore\, we will showcase empirical evidence validating their effectiveness. These findings establish a solid technical groundwork for policymakers\, legal professionals\, and generative AI practitioners alike.
\nBiography
\nLei Li is an Assistant Professor in the Language Technology Institute at Carnegie Mellon University. He received his Ph.D. from Carnegie Mellon University School of Computer Science. He is a recipient of the ACL 2021 Best Paper Award\, the CCF Young Elite Award in 2019\, CCF distinguished speaker in 2017\, the Wu Wen-tsün AI prize in 2017\, and the 2012 ACM SIGKDD dissertation award (runner-up)\, and is recognized as a Notable Area Chair of ICLR 2023. Previously\, he was a faculty member at UC Santa Barbara. Prior to that\, he founded ByteDance AI Lab in 2016 and led its research in NLP\, ML\, Robotics\, and Drug Discovery. He launched ByteDance’s machine translation system VolcTrans and AI writing system Xiaomingbot\, serving one billion users.
DTSTART;TZID=America/New_York:20230901T120000 DTEND;TZID=America/New_York:20230901T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Lei Li (Carnegie Mellon University) “Empowering Responsible Use of Large Language Models” URL:https://www.clsp.jhu.edu/events/lei-li-carnegie-mellon-university-empowering-responsible-use-of-large-language-models/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Li\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-24491@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20240401T120000 DTEND;TZID=America/New_York:20240401T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Yuan Gong URL:https://www.clsp.jhu.edu/events/yuan-gong/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,April\,Gong END:VEVENT BEGIN:VEVENT UID:ai1ec-24507@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nHistory repeats itself\, sometimes in a bad way. Preventing natural or man-made disasters requires being aware of these patterns and taking pre-emptive action to address and reduce them\, or ideally\, eliminate them. Emerging events\, such as the COVID pandemic and the Ukraine Crisis\, require a time-sensitive comprehensive understanding of the situation to allow for appropriate decision-making and effective action response. Automated generation of situation reports can significantly reduce the time\, effort\, and cost for domain experts when preparing their official human-curated reports. However\, AI research toward this goal has been very limited\, and no successful trials have yet been conducted to automate such report generation and “what-if” disaster forecasting. Pre-existing natural language processing and information retrieval techniques are insufficient to identify\, locate\, and summarize important information\, and lack detailed\, structured\, and strategic awareness. In this talk I will present SmartBook\, a novel framework addressing a task that cannot be solved by large language models alone\, which consumes large volumes of multimodal multilingual news data and produces a structured situation report with multiple hypotheses (claims) summarized and grounded with rich links to factual evidence through multimodal knowledge extraction\, claim detection\, fact checking\, misinformation detection and factual error correction. Furthermore\, SmartBook can also serve as a novel news event simulator\, or an intelligent prophetess. Given “What-if” conditions and dimensions elicited from a domain expert user concerning a disaster scenario\, SmartBook will induce schemas from historical events\, and automatically generate a complex event graph along with a timeline of news articles that describe new simulated events and character-centric stories\, based on a new Λ-shaped attention mask that can generate text with infinite length. By effectively simulating disaster scenarios in both event graph and natural language format\, we expect SmartBook will greatly assist humanitarian workers and policymakers to exercise reality checks\, and thus better prevent and respond to future disasters.
\nBio
\nHeng Ji is a professor in the Computer Science Department\, and an affiliated faculty member of the Electrical and Computer Engineering Department and the Coordinated Science Laboratory\, at the University of Illinois Urbana-Champaign. She is an Amazon Scholar. She is the Founding Director of the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE). She received her B.A. and M.A. in Computational Linguistics from Tsinghua University\, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing\, especially on Multimedia Multilingual Information Extraction\, Knowledge-enhanced Large Language Models\, Knowledge-driven Generation and Conversational AI. She was selected as a Young Scientist to attend the 6th World Laureates Association Forum\, and selected to participate in DARPA AI Forward in 2023. She was selected as a “Young Scientist” and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. The awards she received include Women Leaders of Conversational AI (Class of 2023) by Project Voice\, the “AI’s 10 to Watch” Award by IEEE Intelligent Systems in 2013\, the NSF CAREER award in 2009\, PACLIC2012 Best Paper runner-up\, the “Best of ICDM2013” paper award\, the “Best of SDM2013” paper award\, an ACL2018 Best Demo Paper nomination\, the ACL2020 Best Demo Paper Award\, the NAACL2021 Best Demo Paper Award\, Google Research Awards in 2009 and 2014\, IBM Watson Faculty Awards in 2012 and 2014\, and Bosch Research Awards in 2014-2018. She was invited to testify to the U.S. House Cybersecurity\, Data Analytics\, & IT Committee as an AI expert in 2023. She was invited by the Secretary of the U.S. Air Force and AFRL to join the Air Force Data Analytics Expert Panel to inform the Air Force Strategy 2030\, and invited to speak at the Federal Information Integrity R&D Interagency Working Group (IIRD IWG) briefing in 2023.
She is the lead of many multi-institution projects and tasks\, including the U.S. ARL projects on information fusion and knowledge network construction\, the DARPA ECOLE MIRACLE team\, the DARPA KAIROS RESIN team and the DARPA DEFT Tinker Bell team. She has coordinated the NIST TAC Knowledge Base Population task 2010-2022. She was an associate editor for IEEE/ACM Transactions on Audio\, Speech\, and Language Processing\, and served as the Program Committee Co-Chair of many conferences including NAACL-HLT2018 and AACL-IJCNLP2022. She was elected as the North American Chapter of the Association for Computational Linguistics (NAACL) secretary 2020-2023. Her research has been widely supported by U.S. government agencies (DARPA\, NSF\, DoE\, ARL\, IARPA\, AFRL\, DHS) and industry (Apple\, Amazon\, Google\, Facebook\, Bosch\, IBM\, Disney).
DTSTART;TZID=America/New_York:20240405T120000 DTEND;TZID=America/New_York:20240405T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, Maryland 21218 SEQUENCE:0 SUMMARY:Heng Ji (University of Illinois Urbana-Champaign) “SmartBook: an AI Prophetess for Disaster Reporting and Forecasting” URL:https://www.clsp.jhu.edu/events/heng-ji-university-of-illinois-urbana-champaign-smartbook-an-ai-prophetess-for-disaster-reporting-and-forecasting/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,April\,Ji END:VEVENT BEGIN:VEVENT UID:ai1ec-24509@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20240408T120000 DTEND;TZID=America/New_York:20240408T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Berrak Sisman URL:https://www.clsp.jhu.edu/events/berrak-sisman/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,April\,Sisman END:VEVENT BEGIN:VEVENT UID:ai1ec-24511@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20240412T120000 DTEND;TZID=America/New_York:20240412T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Sonal Joshi (JHU) URL:https://www.clsp.jhu.edu/events/sonal-joshi-jhu/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,April\,Joshi END:VEVENT BEGIN:VEVENT UID:ai1ec-24515@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20240415T120000 DTEND;TZID=America/New_York:20240415T131500 LOCATION:Hackerman Hall B17 @ 3400 N. 
Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Matthew Wipperman (Regeneron) URL:https://www.clsp.jhu.edu/events/matthew-wipperman-regeneron/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,April\,Wipperman END:VEVENT BEGIN:VEVENT UID:ai1ec-24517@www.clsp.jhu.edu DTSTAMP:20240329T160132Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nLarge Language Models (LLMs) have become ubiquitous across various domains\, transforming the way we interact with information and conduct research. While the proliferation of LLMs has enhanced numerous applications\, a significant number of high-performing models remain proprietary\, impeding the progress of scientific exploration. LLMs are also susceptible to hallucinations\, generating seemingly credible yet factually inaccurate information that can impact their broad acceptance and integration. In this seminar\, I will commence by introducing one of our open-sourced XGen LLMs. I will delve into its pre-training process and present its results on standard benchmarks. Subsequently\, I will discuss our work involving reasoning with LLMs\, democratizing them for low-resource languages\, and distilling knowledge from a larger (175B) proprietary LLM to a smaller (7B) model in a personalized manner. Finally\, I will conclude by addressing some limitations of LLMs\, emphasizing that scaling alone might not suffice as a solution and that new innovations are needed to tackle these challenges.
\nBio
\nDr. Shafiq Joty (https://raihanjoty.github.io/) is currently a Research Director at Salesforce Research (Palo Alto\, USA)\, where he oversees the NLP group’s efforts in large language modeling (LLM) and generative AI. He also holds the position of a tenured Associate Professor (currently on leave) in the School of Computer Science and Engineering (SCSE) at NTU\, Singapore. He was a founding manager of the Salesforce Research Asia (Singapore) lab. His research has contributed to 30+ patents and 140+ papers in top-tier NLP and ML conferences and journals. He has served as the Program Chair of SIGDIAL-2023\, as a member of the best paper award committees for ICLR-23 and NAACL-22\, and in the capacity of a (senior) area chair for many of the leading NLP and ML conferences.
\n