BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-20117@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nNeural sequence generation systems oftentimes generate sequences by searching for the most likely sequence under the learnt probability distribution. This assumes that the most likely sequence\, i.e. the mode\, under such a model must also be the best sequence it has to offer (often in a given context\, e.g. conditioned on a source sentence in translation). Recent findings in neural machine translation (NMT) show that the true most likely sequence oftentimes is empty under many state-of-the-art NMT models. This follows a large list of other pathologies and biases observed in NMT and other sequence generation models: a length bias\, larger beams degrading performance\, exposure bias\, and many more. Many of these works blame the probabilistic formulation of NMT or maximum likelihood estimation. We provide a different view on this: it is mode-seeking search\, e.g. beam search\, that introduces many of these pathologies and biases\, and such a decision rule is not suitable for the type of distributions learnt by NMT systems. We show that NMT models spread probability mass over many translations\, and that the most likely translation oftentimes is a rare event. We further show that translation distributions do capture important aspects of translation well in expectation. Therefore\, we advocate for decision rules that take into account the entire probability distribution and not just its mode. We provide one example of such a decision rule\, and show that this is a fruitful research direction.
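A minimal sketch of one such distribution-aware decision rule, sampling-based minimum Bayes risk decoding, follows; it is illustrative rather than necessarily the exact rule proposed in the talk, and the sampler and utility metric it assumes are hypothetical placeholders.

# Hedged sketch: sampling-based minimum Bayes risk (MBR) decoding.
# Instead of returning the single most probable translation (the mode),
# pick the candidate with the highest expected utility under samples
# drawn from the model's distribution.

from typing import Callable, List

def mbr_decode(samples: List[str],
               utility: Callable[[str, str], float]) -> str:
    """Return the sample with the highest average utility against all samples."""
    best, best_score = None, float("-inf")
    for candidate in samples:
        # Expected utility of this candidate, approximated by the sample mean.
        score = sum(utility(candidate, other) for other in samples) / len(samples)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Usage (hypothetical): samples = sample_translations(model, source, n=100)
#                       best = mbr_decode(samples, utility=sentence_similarity)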
\nBiography
\nI am an assistant professor (UD) in natural language processing at the Institute for Logic\, Language and Computation\, where I lead the Probabilistic Language Learning group.
\nMy work concerns the design of models and algorithms that learn to represent\, understand\, and generate language data. Examples of specific problems I am interested in include language modelling\, machine translation\, syntactic parsing\, textual entailment\, text classification\, and question answering.
\nI also develop techniques to approach general machine learning problems such as probabilistic inference\, gradient and density estimation.
\nMy interests sit at the intersection of disciplines such as statistics\, machine learning\, approximate inference\, global optimization\, formal languages\, and computational linguistics.
\n\n
DTSTART;TZID=America/New_York:20210419T120000
DTEND;TZID=America/New_York:20210419T131500
LOCATION:via Zoom
SEQUENCE:0
SUMMARY:Wilker Aziz (University of Amsterdam) “The Inadequacy of the Mode in Neural Machine Translation”
URL:https://www.clsp.jhu.edu/events/wilker-aziz-university-of-amsterdam/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2021\,April\,Aziz
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-20120@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nRobotics@Google’s mission is to make robots useful in the real world through machine learning. We are excited about a new model for robotics\, designed for generalization across diverse environments and instructions. This model is focused on scalable data-driven learning\, which is task-agnostic\, leverages simulation\, learns from past experience\, and can be quickly adapted to work in the real world through limited interactions. In this talk\, we’ll share some of our recent work in this direction in both manipulation and locomotion applications.
\nBiography
\nCarolina
Abstract
\nOver the last few years\, deep neural models have taken over the field of natural language processing (NLP)\, brandishing great improvements on many of its sequence-level tasks. But the end-to-end nature of these models makes it hard to figure out whether the way they represent individual words aligns with how language builds itself from the bottom up\, or how lexical changes in register and domain can affect the untested aspects of such representations.
\nIn this talk\, I will present NYTWIT\, a dataset created to challenge large language models at the lexical level\, tasking them with identification of processes leading to the formation of novel English words\, as well as with segmentation and recovery of the specific subclass of novel blends. I will then present XRayEmb\, a method which alleviates the hardships of processing these novelties by fitting a character-level encoder to the existing models’ subword tokenizers\; and conclude with a discussion of the drawbacks of current tokenizers’ vocabulary creation schemes.
\nBiography
\nYuval Pinter is a Senior Lecturer in the Department of Computer Science at Ben-Gurion University of the Negev\, focusing on natural language processing. Yuval got his PhD at the Georgia Institute of Technology School of Interactive Computing as a Bloomberg Data Science PhD Fellow. Before that\, he worked as a Research Engineer at Yahoo Labs and as a Computational Linguist at Ginger Software\, and obtained an MA in Linguistics and a BSc in CS and Mathematics\, both from Tel Aviv University. Yuval blogs (in Hebrew) about language matters on Dagesh Kal.
DTSTART;TZID=America/New_York:20210910T120000
DTEND;TZID=America/New_York:20210910T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD
SEQUENCE:0
SUMMARY:Yuval Pinter (Ben-Gurion University – Virtual Visit) “Challenging and Adapting NLP Models to Lexical Phenomena”
URL:https://www.clsp.jhu.edu/events/yuval-pinter/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2021\,Pinter\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-20723@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nText simplification aims to help audiences read and understand a piece of text through lexical\, syntactic\, and discourse modifications\, while remaining faithful to its central idea and meaning. Thanks to large-scale parallel corpora derived from Wikipedia and News\, much of modern-day text simplification research focuses on sentence simplification\, transforming original\, more complex sentences into simplified versions. In this talk\, I present new frontiers that focus on discourse operations. First\, we consider the challenging task of simplifying highly technical language\, in our case\, medical texts. We introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric to quantify stylistic differences between the two\, and models for paragraph-level simplification. Second\, we present the first data-driven study of inserting elaborations and explanations during simplification\, and illustrate the richness and complexities of this phenomenon.
\nBiography
\nAbstract
\nRaytheon BBN participated in the IARPA MATERIAL program\, whose objective is to enable rapid development of language-independent methods for cross-lingual information retrieval (CLIR). The challenging CLIR task of retrieving documents written (or spoken) in one language so that they satisfy an information need expressed in a different language is exacerbated by unique challenges posed by the MATERIAL program: limited training data for automatic speech recognition and machine translation\, scant lexical resources\, non-standardized orthography\, etc. Furthermore\, the format of the queries and the “Query-Weighted Value” performance measure are non-standard and not previously studied in the IR community. In this talk\, we will describe the Raytheon BBN CLIR system\, which was successful at addressing the above challenges and unique characteristics of the program.
\nBiography
\nDamianos Karakos has been at Raytheon BBN for the past nine years\, where he is currently a Senior Principal Engineer\, Research. Before that\, he was research faculty at Johns Hopkins University. He has worked on several Government projects (e.g.\, DARPA GALE\, DARPA RATS\, IARPA BABEL\, IARPA MATERIAL\, IARPA BETTER) and on a variety of HLT-related topics (e.g.\, speech recognition\, speech activity detection\, keyword search\, information retrieval). He has published more than 60 peer-reviewed papers. His research interests lie at the intersection of human language technology and machine learning\, with an emphasis on statistical methods. He obtained a PhD in Electrical Engineering from the University of Maryland\, College Park\, in 2002.
\n\n
Abstract
\nIn recent years\, the field of Natural Language Processing has seen a profusion of tasks\, datasets\, and systems that facilitate reasoning about real-world situations through language (e.g.\, RTE\, MNLI\, COMET). Such systems might\, for example\, be trained to consider a situation where “somebody dropped a glass on the floor\,” and conclude it is likely that “the glass shattered” as a result. In this talk\, I will discuss three pieces of work that revisit assumptions made by or about these systems. In the first work\, I develop a Defeasible Inference task\, which enables a system to recognize when a prior assumption it has made may no longer be true in light of new evidence it receives. The second work I will discuss revisits partial-input baselines\, which have highlighted issues of spurious correlations in natural language reasoning datasets and led to unfavorable assumptions about models’ reasoning abilities. In particular\, I will discuss experiments that show models may still learn to reason in the presence of spurious dataset artifacts. Finally\, I will touch on work analyzing harmful assumptions made by reasoning models in the form of social stereotypes\, particularly in the case of free-form generative reasoning models.
\nBiography
\nRachel Rudinger is an Assistant Professor in the Department of Computer Science at the University of Maryland\, College Park. She holds joint appointments in the Department of Linguistics and the Institute for Advanced Computer Studies (UMIACS). In 2019\, Rachel completed her Ph.D. in Computer Science at Johns Hopkins University in the Center for Language and Speech Processing. From 2019-2020\, she was a Young Investigator at the Allen Institute for AI in Seattle\, and a visiting researcher at the University of Washington. Her research interests include computational semantics\, common-sense reasoning\, and issues of social bias and fairness in NLP.
DTSTART;TZID=America/New_York:20220916T120000
DTEND;TZID=America/New_York:20220916T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Rachel Rudinger (University of Maryland\, College Park) “Not So Fast!: Revisiting Assumptions in (and about) Natural Language Reasoning”
URL:https://www.clsp.jhu.edu/events/rachel-rudinger-university-of-maryland-college-park-not-so-fast-revisiting-assumptions-in-and-about-natural-language-reasoning/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Rudinger\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-22375@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nI will present our work on data augmentation using style transfer as a way to improve domain adaptation in sequence labeling tasks. The target domain is social media data\, and the task is named entity recognition (NER). The premise is that we can transform the labelled out of domain data into something that stylistically is more closely related to the target data. Then we can train a model on a combination of the generated data and the smaller amount of in domain data to improve NER prediction performance. I will show recent empirical results on these efforts.
\nIf time allows\, I will also give an overview of other research projects I’m currently leading at the RiTUAL (Research in Text Understanding and Analysis of Language) lab. The common thread among all these research problems is the scarcity of labeled data.
\nBiography
\nThamar Solorio is a Professor of Computer Science at the University of Houston (UH). She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica\, Óptica y Electrónica\, in Puebla\, Mexico. Her research interests include information extraction from social media data\, enabling technology for code-switched data\, stylistic modeling of text\, and more recently multimodal approaches for online content understanding. She is the director and founder of the RiTUAL Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution\, and recipient of the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is currently serving a second term as an elected board member of the North American Chapter of the Association for Computational Linguistics and was PC co-chair for NAACL 2019. She recently joined the team of Editors in Chief for the ACL Rolling Review (ARR) system. Her research is currently funded by the NSF and by ADOBE.
DTSTART;TZID=America/New_York:20220923T120000
DTEND;TZID=America/New_York:20220923T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Thamar Solorio (University of Houston) “Style Transfer for Data Augmentation in Sequence Labeling Tasks”
URL:https://www.clsp.jhu.edu/events/thamar-solorio-university-of-houston-style-transfer-for-data-augmentation-in-sequence-labeling-tasks/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,September\,Solorio
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-22380@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nThe availability of large multilingual pre-trained language models has opened up exciting pathways for developing NLP technologies for languages with scarce resources. In this talk I will advocate for the need to go beyond the most common languages in multilingual evaluation\, and discuss the challenges of handling new\, unseen-during-training languages and varieties. I will also share some of my experiences working with indigenous and other endangered language communities and activists.
\nBiography
\nAntonios Anastasopoulos is an Assistant Professor in Computer Science at George Mason University. In 2019\, Antonis received his PhD in Computer Science from the University of Notre Dame and then worked as a postdoctoral researcher at the Language Technologies Institute at Carnegie Mellon University. His research interests revolve around computational linguistics and natural language processing with a focus on low-resource settings\, endangered languages\, and cross-lingual learning.
DTSTART;TZID=America/New_York:20220930T120000
DTEND;TZID=America/New_York:20220930T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Antonios Anastasopoulos (George Mason University) “NLP Beyond the Top-100 Languages”
URL:https://www.clsp.jhu.edu/events/antonis-anastasopoulos-george-mason-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Anastasopoulos\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23515@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:
Abstract
\nHow important are different temporal speech modulations for speech recognition? We answer this question from two complementary perspectives. Firstly\, we quantify the amount of phonetic information in the modulation spectrum of speech by computing the mutual information between temporal modulations and frame-wise phoneme labels. Looking from another perspective\, we ask which speech modulations an Automatic Speech Recognition (ASR) system prefers for its operation. Data-driven weights are learned over the modulation spectrum and optimized for an end-to-end ASR task. Both methods unanimously agree that speech information is mostly contained in slow modulation. Maximum mutual information occurs around 3-6 Hz\, which also happens to be the range of modulations most preferred by the ASR. In addition\, we show that the incorporation of this knowledge into ASRs significantly reduces their dependency on the amount of training data.
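A minimal sketch of the first analysis, assuming hypothetical precomputed modulation-band energies and frame-wise phoneme labels; it estimates the mutual information between each modulation band and the phoneme label with scikit-learn, which is one of several reasonable estimators for this purpose.

# Hedged sketch: estimate how much phonetic information each modulation
# band carries, via mutual information with frame-wise phoneme labels.
# `band_energies` (n_frames x n_bands) and `phoneme_labels` (n_frames,)
# are hypothetical stand-ins for real modulation-spectrum features.

import numpy as np
from sklearn.feature_selection import mutual_info_classif

def modulation_band_information(band_energies: np.ndarray,
                                phoneme_labels: np.ndarray) -> np.ndarray:
    """Return one mutual-information estimate (in nats) per modulation band."""
    return mutual_info_classif(band_energies, phoneme_labels, random_state=0)

# Usage (hypothetical): mi = modulation_band_information(band_energies, labels)
# The band with the largest value indicates the most informative modulation rate.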
\n\n
Learning How to Play With The Machines: Taking Stock of Where the Collaboration Between Computational and Social Science Stands
\nSpeakers: Jeff Gill\, Ernesto Calvo\, Hale Sirin and Antonios Anastasopoulos
DTSTART;TZID=America/New_York:20230407T120000
DTEND;TZID=America/New_York:20230407T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street
SEQUENCE:0
SUMMARY:JHU CLSP APSA Roundtable on Learning How to Play with the Machines
URL:https://www.clsp.jhu.edu/events/jhu-clsp-apsa-roundtable-on-learning-how-to-play-with-the-machines/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,April\,APSA Roundtable
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23586@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:
DTSTART;TZID=America/New_York:20230410T120000
DTEND;TZID=America/New_York:20230410T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Student Seminar – Ruizhe Huang
URL:https://www.clsp.jhu.edu/events/student-seminar-ruizhe-huang/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,April\,Huang
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23588@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nAdvances in open domain Large Language Models (LLMs) starting with BERT and more recently with GPT-4\, PaLM\, and LLaMA have facilitated dramatic improvements in conversational systems. These improvements include an unprecedented breadth of conversational interactions between humans and machines while maintaining and sometimes surpassing the accuracy of systems trained specifically for known\, closed domains. However\, many applications still require higher levels of accuracy than pre-trained LLMs can provide. There are many studies underway to accomplish this. Broadly speaking\, the methods assume the pre-trained models are fixed (due to cost/time)\, and instead look to various augmentation methods including prompting strategies and model adaptation/fine-tuning.
\nOne augmentation strategy leverages the context of the conversation. For example\, who are the participants and what is known about these individuals (personal context)\, what was just said (dialogue context)\, where is the conversation taking place (geo context)\, what time of day and season is it (time context)\, etc. A powerful form of context is the shared visual setting of the conversation between the human(s) and machine. The shared visual scene may be from a device (phone\, smart glasses) or represented on a screen (browser\, maps\, etc.). The elements in the visual context can be exploited by grounding the natural language conversational interaction\, thereby changing the priors of certain concepts and increasing the accuracy of the system. In this talk\, I will present some of my historical work in this area as well as my recent work in the AI Virtual Assistant (AVA) Lab at Georgia Tech.
\nBio
\nDr. Larry Heck is a Professor with a joint appointment in the School of Electrical and Computer Engineering and the School of Interactive Computing at the Georgia Institute of Technology. He holds the Rhesa S. Farmer Distinguished Chair of Advanced Computing Concepts and is a Georgia Research Alliance Eminent Scholar. He received the BSEE from Texas Tech University (1986)\, and MSEE and PhD EE from the Georgia Institute of Technology (1989\, 1991). He is a Fellow of the IEEE\, inducted into the Academy of Distinguished Engineering Alumni at Georgia Tech and received the Distinguished Engineer Award from the Texas Tech University Whitacre College of Engineering. He was a Senior Research Engineer with SRI (1992-98)\, Vice President of R&D at Nuance (1998-2005)\, Vice President of Search and Advertising Sciences at Yahoo! (2005-2009)\, Chief Scientist of the Microsoft Speech products and Distinguished Engineer in Microsoft Research (2009-2014)\, Principal Scientist with Google Research (2014-2017)\, and CEO of Viv Labs and SVP at Samsung (2017-2021).
\n\n
Abstract
\nOur models achieve state-of-the-art performance and lay important groundwork towards realizing a universal translation system. At the same time\, we keep making open-source contributions for everyone to keep advancing the research for the languages they care about.
\nPaco is a Research Scientist Manager supporting translation teams in Meta AI (FAIR). He works in the field of machine translation with a focus on low-resource translation (e.g. NLLB\, FLORES) and the aim to break language barriers. He joined Meta in 2016. His research has been published in top-tier NLP venues like ACL and EMNLP. He served as Research co-chair at AMTA (2020-2022). He has organized several research competitions focused on low-resource translation and data filtering. Paco obtained his PhD from the ITESM in Mexico\, was a visiting scholar at the LTI-CMU from 2008-2009\, and participated in DARPA’s GALE evaluation program. Paco was a post-doc and scientist at Qatar Computing Research Institute in Qatar in 2012-2016.
DTSTART;TZID=America/New_York:20230417T120000
DTEND;TZID=America/New_York:20230417T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Paco Guzman (Meta AI) “Building a Universal Translation System to Break Down Language Barriers”
URL:https://www.clsp.jhu.edu/events/paco-guzman-meta-ai/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,April\,Guzman
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23592@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nLarge language models (LLMs) have ushered in exciting capabilities in language understanding and text generation\, with systems like ChatGPT holding fluent dialogs with users and being almost indistinguishable from humans. While this has obviously raised conversational systems and chatbots to a new level\, it also presents exciting new opportunities for building artificial agents with improved decision making capabilities. Specifically\, the ability to reason with language can allow us to build agents that can 1) execute complex action sequences to effect change in the world\, 2) learn new skills by ‘reading’ in addition to ‘doing’\, and 3) allow for easier personalization and control over their behavior. In this talk\, I will demonstrate how we can build such language-enabled agents that exhibit the above traits across various use cases such as multi-hop question answering\, web interaction\, and robotic tool manipulation. In the end\, I will also discuss some dangers of using these LLM-based systems and some challenges that lie ahead in ensuring their safe use.
\nBiography
\nKarthik Narasimhan is an assistant professor in the Computer Science department at Princeton University and a co-Director of the Princeton NLP group. His research spans the areas of natural language processing and reinforcement learning\, with the goal of building intelligent agents that learn to operate in the world through both their own experience (”doing things”) and leveraging existing human knowledge (”reading about things”). Karthik received his PhD from MIT in 2017\, and spent a year as a visiting research scientist at OpenAI contributing to the GPT language model\, prior to joining Princeton in 2018. His research has been recognized by the NSF CAREER\, a Google Research Scholar Award\, an Amazon research award (2019)\, Bell Labs runner-up prize and outstanding paper awards at EMNLP (2015\, 2016) and NeurIPS (2022).
DTSTART;TZID=America/New_York:20230421T120000
DTEND;TZID=America/New_York:20230421T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Karthik Narasimhan (Princeton University) “Towards General-Purpose Language-Enabled Agents: Machines that can Read\, Think and Act”
URL:https://www.clsp.jhu.edu/events/karthik-narasimhan-princeton-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,April\,Narasimhan
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23606@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:
DTSTART;TZID=America/New_York:20230424T120000
DTEND;TZID=America/New_York:20230424T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Student Seminar – Brian Lu
URL:https://www.clsp.jhu.edu/events/student-seminar-brian-lu/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,April\,Lu
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23608@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nAutomated analysis of student writing has the potential to provide alternatives to selected-response questions such as multiple choice\, and to enable teachers and instructors to assess students’ reasoning skills based on their long-form writing. Further\, automated support to assess both short answers and long passages could provide students with a smoother trajectory towards mastery of written communication. Our methods focus on the specific ideas students express to support formative assessment through different kinds of feedback\, which aims to scaffold their abilities to reason and communicate. In this talk I review our work in the PSU NLP lab on methods for automated assessment of different forms of student writing\, from younger and older students. I will briefly illustrate highly curated datasets created in collaboration with researchers in STEM education\, results from deployment of an older content analysis tool on middle school physics essays\, and very preliminary results on assessment of college students’ physics lab reports. I will also present our current work on short answer assessment using a novel recurrent relation network that incorporates contrastive learning.
\nBio
\nBecky Passonneau has been a Professor in the Department of Computer Science and Engineering at Penn State University since 2016\, when she joined as the first NLP researcher. Since that time the NLP faculty has grown to include Rui Zhang and Wenpeng Yin. Becky’s research in natural language processing addresses computational pragmatics\, meaning the investigation of language as a system of interactive behavior that serves a wide range of purposes. She received her PhD in Linguistics from the University of Chicago in 1985\, and worked at several academic and industry research labs before joining Penn State. Her work is reported in over 140 publications in journals and refereed conference proceedings\, and has been funded through 27 sponsored projects from 16 sources\, including government agencies\, corporate sponsors\, corporate gifts\, and foundations.
DTSTART;TZID=America/New_York:20230428T120000
DTEND;TZID=America/New_York:20230428T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Becky Passonneau (Penn State University) “Automated Support to Scaffold Students’ Short- and Long-form STEM Writing”
URL:https://www.clsp.jhu.edu/events/becky-passonneau-penn-state-university-automated-support-to-scaffold-students-short-and-long-form-stem-writing/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,April\,Passonneau
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23882@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nLarge language models (LLMs) have demonstrated incredible power\, but they also possess vulnerabilities that can lead to misuse and potential attacks. In this presentation\, we will address two fundamental questions regarding the responsible utilization of LLMs: (1) How can we accurately identify AI-generated text? (2) What measures can safeguard the intellectual property of LLMs? We will introduce two recent watermarking techniques designed for text and models\, respectively. Our discussion will encompass the theoretical underpinnings that ensure the correctness of watermark detection\, along with robustness against evasion attacks. Furthermore\, we will showcase empirical evidence validating their effectiveness. These findings establish a solid technical groundwork for policymakers\, legal professionals\, and generative AI practitioners alike.
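For illustration, the sketch below shows one widely used statistical test for detecting a "green-list" text watermark; it is a generic example rather than necessarily either of the techniques presented in this talk, and the vocabulary size, green fraction, seed, and decision threshold are assumptions.

# Hedged sketch: detecting a "green-list" text watermark via a z-test.
# A watermarked generator is assumed to have boosted tokens from a
# pseudo-random "green" subset of the vocabulary; detection counts how many
# observed tokens are green and asks whether that count is improbably high.

import math
import random
from typing import List, Set

def green_set(vocab_size: int, fraction: float, seed: int) -> Set[int]:
    """Pseudo-randomly choose the 'green' token ids (the scheme's shared secret)."""
    rng = random.Random(seed)
    return set(rng.sample(range(vocab_size), int(vocab_size * fraction)))

def watermark_z_score(token_ids: List[int], vocab_size: int,
                      fraction: float = 0.5, seed: int = 0) -> float:
    """z-score of the observed green-token count vs. the unwatermarked expectation."""
    green = green_set(vocab_size, fraction, seed)
    hits = sum(1 for t in token_ids if t in green)
    n = len(token_ids)
    return (hits - fraction * n) / math.sqrt(n * fraction * (1.0 - fraction))

# Usage (hypothetical): flag text as watermarked if watermark_z_score(ids, V) > 4.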
\nBiography
\nLei Li is an Assistant Professor in the Language Technologies Institute at Carnegie Mellon University. He received his Ph.D. from Carnegie Mellon University School of Computer Science. He is a recipient of the ACL 2021 Best Paper Award\, CCF Young Elite Award in 2019\, CCF distinguished speaker in 2017\, Wu Wen-tsün AI prize in 2017\, and 2012 ACM SIGKDD dissertation award (runner-up)\, and is recognized as a Notable Area Chair of ICLR 2023. Previously\, he was a faculty member at UC Santa Barbara. Prior to that\, he founded ByteDance AI Lab in 2016 and led its research in NLP\, ML\, Robotics\, and Drug Discovery. He launched ByteDance’s machine translation system VolcTrans and AI writing system Xiaomingbot\, serving one billion users.
DTSTART;TZID=America/New_York:20230901T120000
DTEND;TZID=America/New_York:20230901T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Lei Li (Carnegie Mellon University) “Empowering Responsible Use of Large Language Models”
URL:https://www.clsp.jhu.edu/events/lei-li-carnegie-mellon-university-empowering-responsible-use-of-large-language-models/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Li\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23886@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nThe arms race to build increasingly larger\, powerful language models (LMs) in the past year has been remarkable. Yet incorporating LMs effectively into practical applications that facilitate manual workflows remains challenging. I will discuss LMs’ limiting factors and our efforts to overcome them. I will start with challenges surrounding efficient and robust LM alignment. I will share insights from our recent paper “Self-Instruct” (ACL 2023)\, where we used vanilla (unaligned) LMs to align themselves\, an approach that has yielded some success. Then\, I will move on to the challenge of tracing the output of LMs to reliable sources\, a weakness that makes them prone to hallucinations. I will discuss our recent approach of ‘according-to’ prompting\, which steers LMs to quote directly from sources observed in their pre-training. If time permits\, I will discuss our ongoing project to adapt LMs to interact with web pages. Throughout the presentation\, I will highlight our progress\, and end with questions about our future progress.\n
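A rough illustration of the 'according-to' idea: the snippet below contrasts a plain prompt with one that steers the model to quote a named source. The generate call is a hypothetical placeholder, and the exact prompt wording used in the paper may differ.

# Hedged sketch: 'according-to' style prompting.
# The grounding clause nudges the model toward text it has seen in a trusted
# source, whose overlap with the answer can then be checked.
# `generate` is a hypothetical stand-in for any LM completion call.

def plain_prompt(question: str) -> str:
    return f"Answer the question: {question}"

def according_to_prompt(question: str, source: str = "Wikipedia") -> str:
    # The added clause is the only change; everything else stays identical.
    return f"Answer the question: {question} Respond by quoting {source} directly."

# Usage (hypothetical):
# answer = generate(according_to_prompt("What causes tides?"))
# One can then measure n-gram overlap between `answer` and the source corpus
# to estimate how grounded the response is.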
Biography
\nDaniel Khashabi is an assistant professor in computer science at Johns Hopkins University and a member of the Center for Language and Speech Processing (CLSP). He is interested in building reasoning-driven modular NLP systems that are robust\, transparent\, and communicative\, particularly those that use natural language as the communication medium. Khashabi has published over 40 papers on natural language processing and AI in top-tier venues. His research has won the ACL 2023 Outstanding Paper Award\, NAACL 2022 Best Paper Award\, research gifts from the Allen Institute for AI\, and an Amazon Research Award 2023. Before joining Hopkins\, he was a postdoctoral fellow at the Allen Institute for AI (2019-2022) and obtained a Ph.D. from the University of Pennsylvania in 2019.
DTSTART;TZID=America/New_York:20230908T120000
DTEND;TZID=America/New_York:20230908T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Daniel Khashabi (Johns Hopkins University) “Building More Helpful Language Models”
URL:https://www.clsp.jhu.edu/events/daniel-khashabi-johns-hopkins-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Khashabi\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23888@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:Abstract
\nEmbedding text sequences is a widespread requirement in modern language understanding. Existing approaches focus largely on constant-size representations. This is problematic\, as the amount of information contained in text often varies with the length of the input. We propose a solution called Nugget\, which encodes language into a representation based on a dynamically selected subset of input tokens. These nuggets are learned through tasks like autoencoding and machine translation\, and intuitively segment language into meaningful units. We demonstrate Nugget outperforms related approaches in tasks involving semantic comparison. Finally\, we illustrate these compact units allow for expanding the contextual window of a language model (LM)\, suggesting new future LMs that can condition on significantly larger amounts of content.
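To make the idea of a dynamically selected token subset concrete, here is a toy sketch in which a learned scorer keeps a length-proportional top-k of token states as the sequence representation; it illustrates the general mechanism only, not the actual Nugget architecture, and the shapes, keep ratio, and training details are assumptions.

# Hedged sketch: compress a token sequence into a length-dependent subset
# of token states chosen by a learned scorer. A real model would also need
# a trick (e.g. a residual or straight-through path) so the scorer receives
# gradient; that is omitted here for brevity.

import torch
import torch.nn as nn

class TokenSubsetEncoder(nn.Module):
    def __init__(self, hidden_dim: int, keep_ratio: float = 0.1):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)  # one relevance score per token
        self.keep_ratio = keep_ratio

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_dim) from any encoder.
        scores = self.scorer(token_states).squeeze(-1)           # (batch, seq_len)
        k = max(1, int(token_states.size(1) * self.keep_ratio))  # grows with input length
        top_idx = scores.topk(k, dim=1).indices                  # (batch, k)
        idx = top_idx.unsqueeze(-1).expand(-1, -1, token_states.size(-1))
        return token_states.gather(1, idx)                       # (batch, k, hidden_dim)

# Usage (hypothetical): nuggets = TokenSubsetEncoder(768)(encoder_outputs)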
DTSTART;TZID=America/New_York:20230911T120000
DTEND;TZID=America/New_York:20230911T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Student Seminar – Guanghui Qin “Nugget: Neural Agglomerative Embeddings of Text (ICML 2023)”
URL:https://www.clsp.jhu.edu/events/student-seminar-guanghui-qin/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Qin\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23892@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nThe growing power in computing and AI promises a near-term future of human-machine teamwork. In this talk\, I will present my research group’s efforts in understanding the complex dynamics of human-machine interaction and designing intelligent machines aimed to assist and collaborate with people. I will focus on 1) tools for onboarding machine teammates and authoring machine assistance\, 2) methods for detecting\, and broadly managing\, errors in collaboration\, and 3) building blocks of knowledge needed to enable ad hoc human-machine teamwork. I will also highlight our recent work on designing assistive\, collaborative machines to support older adults aging in place.
\nBiography
\nChien-Ming Huang is the John C. Malone Assistant Professor in the Department of Computer Science at the Johns Hopkins University. His research focuses on designing interactive AI aimed to assist and collaborate with people. He publishes in top-tier venues in HRI\, HCI\, and robotics including Science Robotics\, HRI\, CHI\, and CSCW. His research has received media coverage from MIT Technology Review\, Tech Insider\, and Science Nation. Huang completed his postdoctoral training at Yale University and received his Ph.D. in Computer Science at the University of Wisconsin–Madison. He is a recipient of the NSF CAREER award. https://www.cs.jhu.edu/~cmhuang/
DTSTART;TZID=America/New_York:20230915T120000
DTEND;TZID=America/New_York:20230915T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Chien-Ming Huang (Johns Hopkins University) “Becoming Teammates: Designing Assistive\, Collaborative Machines”
URL:https://www.clsp.jhu.edu/events/chien-ming-huang-johns-hopkins-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Huang\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23894@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nThe use of NLP in the realm of financial technology is broad and complex\, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks\; however\, no LLM specialized for the financial domain has been reported in the literature. In this work\, we present BloombergGPT\, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg’s extensive data sources\, perhaps the largest domain-specific dataset yet\, augmented with 345 billion tokens from general-purpose datasets. We validate BloombergGPT on standard LLM benchmarks\, open financial benchmarks\, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally\, we explain our modeling choices\, training process\, and evaluation methodology.
\nBiography
Mark Dredze is the John C. Malone Professor of Computer Science at Johns Hopkins University and the Director of Research (Foundations of AI) for the JHU AI-X Foundry. He develops Artificial Intelligence Systems based on natural language processing and explores applications to public health and medicine.
\nProf. Dredze is affiliated with the Malone Center for Engineering in Healthcare\, the Center for Language and Speech Processing\, among others. He holds a joint appointment in the Biomedical Informatics & Data Science Section (BIDS)\, under the Department of Medicine (DOM)\, Division of General Internal Medicine (GIM) in the School of Medicine. He obtained his PhD from the University of Pennsylvania in 2009.
DTSTART;TZID=America/New_York:20230918T120000
DTEND;TZID=America/New_York:20230918T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Mark Dredze (Johns Hopkins University) “BloombergGPT: A Large Language Model for Finance”
URL:https://www.clsp.jhu.edu/events/mark-dredze-johns-hopkins-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Dredze\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23983@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nVisually rich documents (scanned or digital) remain important for many consumer and business use cases. During this talk we will share recent work from our team in the Document Intelligence Lab of Adobe Research to understand\, create\, and interact with these documents. First\, we’ll share a series of work on building models to decompose and understand the structure of documents to support use cases around document analysis and accessibility. Next\, we’ll explore document semantic understanding for a project where we convert natural language contract clauses to code to support business automation. Finally\, we’ll discuss DocEdit\, a model and dataset that enables editing structured documents from natural language.
\nBIOS:
\nRajiv Jain is a Senior Research Scientist in the Document Intelligence Lab in Adobe Research\, where his research focuses on understanding the layout\, content\, and interaction with documents. Prior to joining Adobe\, Rajiv was a consultant at DARPA\, where he worked on the Media Forensics Program to secure digital imagery. He previously served for 10 years as a researcher for the Department of Defense where he worked on projects around large scale systems\, computer vision\, and network security. He received his PhD in computer science from the University of Maryland\, College Park working in the field of document image analysis and retrieval.\nChris Tensmeyer primarily focuses on multi-modal document layout and content understanding as a Research Scientist in the Document Intelligence Lab of Adobe Research. Since joining Adobe 5 years ago\, his work has directly impacted popular Adobe features such as mobile Acrobat Liquid Mode\, PDF table extraction\, handwriting recognition\, and scanned document detection. Other research interests include general Computer Vision and Deep Learning. He received his PhD in Computer Science from Brigham Young University on the topic of Deep Learning for Document Image Analysis.
DTSTART;TZID=America/New_York:20230922T120000
DTEND;TZID=America/New_York:20230922T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Rajiv Jain and Chris Tensmeyer (Adobe) “Document Intelligence at Adobe Research”
URL:https://www.clsp.jhu.edu/events/rajiv-jain-and-chris-tensmeyer-adobe-document-intelligence-at-adobe-research/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Jain and Tensmeyer\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23896@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nThe field of NLP is in the midst of a disruptive shift\, fueled most recently by the advent of large language models (LLMs)\, with impacts on our methodologies\, funding and public perception. While the core technologies and scope of real-world impact of our field may be changing (everything is different!)\, many of the same key challenges faced since the inception of our field remain (nothing has changed). In this talk I’ll describe recent work characterizing and tackling some of these challenges\, notably: data-efficient domain adaptation and lifelong learning. I will also anchor discussion of cycles and shifts in the field by describing findings from a qualitative study of factors shaping the community over time\, including culture\, incentives\, and infrastructure. Through these complementary lenses into the past\, present and future\, I aim to inspire shared hope\, excitement and discussion.
\nBio
\nEmma Strubell is the Raj Reddy Assistant Professor in the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University\, and a Visiting Scientist at the Allen Institute for Artificial Intelligence. Previously she held research scientist roles at Google and FAIR after earning her doctoral degree in 2019 from the University of Massachusetts Amherst. Her research lies at the intersection of natural language processing and machine learning\, with a focus on providing pragmatic solutions to practitioners who wish to gain insights from natural language text via computation- and data-efficient AI. Her work has been recognized with a Madrona AI Impact Award\, best paper awards at ACL and EMNLP\, and cited in news outlets including the New York Times and Wall Street Journal.
DTSTART;TZID=America/New_York:20230925T120000
DTEND;TZID=America/New_York:20230925T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Emma Strubell (Carnegie Mellon University) “Large Language Models: Everything’s Different and Nothing Has Changed”
URL:https://www.clsp.jhu.edu/events/emma-strubell-carnegie-mellon-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,September\,Strubell
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23898@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:Abstract
\nAny valuable NLP dataset has traditionally been shipped with crowdsourced categorical labels. Instructions for collecting these labels are easy to communicate and the labels themselves are easy to annotate. However\, as self-supervision based methods are getting better at basically everything\, human annotations may need to provide more nuanced supervision or enable more detailed evaluation in order to be worth further collecting. One natural extension to existing categorical annotation schemes is to obtain uncertainty information beyond a single hard label. In this talk\, I will discuss my recent efforts on introducing scalar labels in place of categorical labels as a form of uncertainty annotation. We demonstrate that\, compared to other more obvious annotation schemes for eliciting uncertainty information\, scalar labels are significantly more cost-effective to annotate\, provide reliable evaluation\, and have a theoretical connection to existing predictive uncertainty metrics. In particular\, they motivate using other losses as surrogates for calibration evaluation.
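As a rough illustration of scoring model confidence against a scalar human label rather than a hard class, the snippet below computes a Brier-style squared-error surrogate; it is a generic example of the idea, not the specific losses studied in the talk.

# Hedged sketch: compare a model's predicted probability for the positive
# class against a scalar human uncertainty label in [0, 1], using a
# squared-error (Brier-style) surrogate instead of accuracy on a hard label.

from typing import List

def scalar_label_score(model_probs: List[float], human_scalars: List[float]) -> float:
    """Mean squared gap between model confidence and the scalar human label
    (lower is better); a hard-label evaluation would first threshold human_scalars."""
    assert len(model_probs) == len(human_scalars)
    return sum((p - s) ** 2 for p, s in zip(model_probs, human_scalars)) / len(model_probs)

# Usage (hypothetical): score = scalar_label_score([0.9, 0.2, 0.7], [0.8, 0.1, 0.5])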
DTSTART;TZID=America/New_York:20230929T120000
DTEND;TZID=America/New_York:20230929T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:CLSP Student Seminar – Zhengping Jiang “Scalar Labels for Capturing Human Uncertainty”
URL:https://www.clsp.jhu.edu/events/clsp-student-seminar-zhengping-jiang/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Jiang\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-24491@www.clsp.jhu.edu
DTSTAMP:20240328T104618Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
DTSTART;TZID=America/New_York:20240401T120000
DTEND;TZID=America/New_York:20240401T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Yuan Gong
URL:https://www.clsp.jhu.edu/events/yuan-gong/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2024\,April\,Gong
END:VEVENT
END:VCALENDAR