BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-20115@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nData science in small medical datasets usually means doing precision guesswork on unreliable data provided by those with high expectations. The first part of this talk will focus on issues that data scientists and engineers have to address when working with this kind of data (e.g. unreliable labels\, the effect of confounding factors\, the necessity of clinical interpretability\, difficulties with fusing multiple data sets). The second part of the talk will include some real examples of this kind of data science in the field of neurology (prediction of motor deficits in Parkinson’s disease based on acoustic analysis of speech\, diagnosis of Parkinson’s disease dysgraphia utilising online handwriting\, exploring the Mozart effect in epilepsy based on music information retrieval) and psychology (assessment of graphomotor disabilities in children with developmental dysgraphia).
\nBiography
\nAbstract
\nOver the last few years\, deep neural models have taken over the field of natural language processing (NLP)\, brandishing great improvements on many of its sequence-level tasks. But the end-to-end nature of these models makes it hard to figure out whether the way they represent individual words aligns with how language builds itself from the bottom up\, or how lexical changes in register and domain can affect the untested aspects of such representations.
\nIn this talk\, I will present NYTWIT\, a dataset created to challenge large language models at the lexical level\, tasking them with identification of processes leading to the formation of novel English words\, as well as with segmentation and recovery of the specific subclass of novel blends. I will then present XRayEmb\, a method which alleviates the hardships of processing these novelties by fitting a character-level encoder to the existing models’ subword tokenizers\; and conclude with a discussion of the drawbacks of current tokenizers’ vocabulary creation schemes.
\nBiography
\nYuval Pinter is a Senior Lecturer in the Department of Computer Science at Ben-Gurion University of the Negev\, focusing on natural language processing. Yuval got his PhD at the Georgia Institute of Technology School of Interactive Computing as a Bloomberg Data Science PhD Fellow. Before that\, he worked as a Research Engineer at Yahoo Labs and as a Computational Linguist at Ginger Software\, and obtained an MA in Linguistics and a BSc in CS and Mathematics\, both from Tel Aviv University. Yuval blogs (in Hebrew) about language matters on Dagesh Kal.
DTSTART;TZID=America/New_York:20210910T120000
DTEND;TZID=America/New_York:20210910T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD
SEQUENCE:0
SUMMARY:Yuval Pinter (Ben-Gurion University – Virtual Visit) “Challenging and Adapting NLP Models to Lexical Phenomena”
URL:https://www.clsp.jhu.edu/events/yuval-pinter/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2021\,Pinter\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-20723@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nText simplification aims to help audiences read and understand a piece of text through lexical\, syntactic\, and discourse modifications\, while remaining faithful to its central idea and meaning. Thanks to large-scale parallel corpora derived from Wikipedia and News\, much of modern-day text simplification research focuses on sentence simplification\, transforming original\, more complex sentences into simplified versions. In this talk\, I present new frontiers that focus on discourse operations. First\, we consider the challenging task of simplifying highly technical language\, in our case\, medical texts. We introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric to quantify stylistic differences between the two\, and models for paragraph-level simplification. Second\, we present the first data-driven study of inserting elaborations and explanations during simplification\, and illustrate the richness and complexities of this phenomenon.
\nBiography
\nAbstract
\nRaytheon BBN participated in the IARPA MATERIAL program\, whose objective is to enable rapid development of language-independent methods for cross-lingual information retrieval (CLIR). The challenging CLIR task of retrieving documents written (or spoken) in one language so that they satisfy an information need expressed in a different language is exacerbated by unique challenges posed by the MATERIAL program: limited training data for automatic speech recognition and machine translation\, scant lexical resources\, non-standardized orthography\, etc. Furthermore\, the format of the queries and the “Query-Weighted Value” performance measure are non-standard and not previously studied in the IR community. In this talk\, we will describe the Raytheon BBN CLIR system\, which was successful at addressing the above challenges and unique characteristics of the program.
\nBiography
\nDamianos Karakos has been at Raytheon BBN for the past nine years\, where he is currently a Senior Principal Engineer\, Research. Before that\, he was research faculty at Johns Hopkins University. He has worked on several Government projects (e.g.\, DARPA GALE\, DARPA RATS\, IARPA BABEL\, IARPA MATERIAL\, IARPA BETTER) and on a variety of HLT-related topics (e.g.\, speech recognition\, speech activity detection\, keyword search\, information retrieval). He has published more than 60 peer-reviewed papers. His research interests lie at the intersection of human language technology and machine learning\, with an emphasis on statistical methods. He obtained a PhD in Electrical Engineering from the University of Maryland\, College Park\, in 2002.
Abstract
\nAdversarial attacks deceive neural network systems by adding carefully crafted perturbations to benign signals. Being almost imperceptible to humans\, these attacks pose a severe security threat to state-of-the-art speech and speaker recognition systems\, making it vital to propose countermeasures against them. In this talk\, we focus on 1) classifying a given adversarial attack by attack algorithm type\, threat model type\, and signal-to-adversarial-noise ratio\, and 2) developing a novel speech denoising solution to further improve the classification performance.
\nOur proposed approach uses an x-vector network as a signature extractor to get embeddings\, which we call signatures. These signatures contain information about the attack and can help classify different attack algorithms\, threat models\, and signal-to-adversarial-noise ratios. We demonstrate the transferability of such signatures to other tasks. In particular\, a signature extractor trained to classify attacks against speaker identification can also be used to classify attacks against speaker verification and speech recognition. We also show that signatures can be used to detect unknown attacks\, i.e.\, attacks not included during training. Lastly\, we propose to improve the signature extractor by making its job easier: removing the clean signal from the adversarial example (which consists of clean signal + perturbation). We train our signature extractor using the adversarial perturbation. At inference time\, we use a time-domain denoiser to obtain the adversarial perturbation from adversarial examples. Using our improved approach\, we show that common attacks in the literature (Fast Gradient Sign Method (FGSM)\, Projected Gradient Descent (PGD)\, Carlini-Wagner (CW)) can be classified with accuracy as high as 96%. We also detect unknown attacks with an equal error rate (EER) of about 9%\, which is very promising.
DTSTART;TZID=America/New_York:20220304T120000
DTEND;TZID=America/New_York:20220304T131500
LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Student Seminar – Sonal Joshi “Classify and Detect Adversarial Attacks Against Speaker and Speech Recognition Systems”
URL:https://www.clsp.jhu.edu/events/student-seminar-sonal-joshi/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Joshi\,March
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-21615@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:Abstract
DTSTART;TZID=America/New_York:20220311T120000
DTEND;TZID=America/New_York:20220311T131500
LOCATION:Virtual Seminar
SEQUENCE:0
SUMMARY:Student Seminar – Anton Belyy “Systems for Human-AI Cooperation on Collecting Semantic Annotations”
URL:https://www.clsp.jhu.edu/events/student-seminar-anton-belyy-systems-for-human-ai-cooperation-on-collecting-semantic-annotations/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Belyy\,March
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-21621@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nSystems that support expressive\, situated natural language interactions are essential for expanding access to complex computing systems\, such as robots and databases\, to non-experts. Reasoning and learning in such natural language interactions is a challenging open problem. For example\, resolving sentence meaning requires reasoning not only about word meaning\, but also about the interaction context\, including the history of the interaction and the situated environment. In addition\, the sequential dynamics that arise between user and system in and across interactions make learning from static data\, i.e.\, supervised data\, both challenging and ineffective. However\, these same interaction dynamics result in ample opportunities for learning from implicit and explicit feedback that arises naturally in the interaction. This lays the foundation for systems that continually learn\, improve\, and adapt their language use through interaction\, without additional annotation effort. In this talk\, I will focus on these challenges and opportunities. First\, I will describe our work on modeling dependencies between language meaning and interaction context when mapping natural language in interaction to executable code. In the second part of the talk\, I will describe our work on language understanding and generation in collaborative interactions\, focusing on continual learning from explicit and implicit user feedback.
\nBiography
\nAlane Suhr is a PhD Candidate in the Department of Computer Science at Cornell University\, advised by Yoav Artzi. Her research spans natural language processing\, machine learning\, and computer vision\, with a focus on building systems that participate and continually learn in situated natural language interactions with human users. Alane’s work has been recognized by paper awards at ACL and NAACL\, and has been supported by fellowships and grants\, including an NSF Graduate Research Fellowship\, a Facebook PhD Fellowship\, and research awards from AI2\, ParlAI\, and AWS. Alane has also co-organized multiple workshops and tutorials appearing at NeurIPS\, EMNLP\, NAACL\, and ACL. Previously\, Alane received a BS in Computer Science and Engineering as an Eminence Fellow at the Ohio State University.
DTSTART;TZID=America/New_York:20220314T120000
DTEND;TZID=America/New_York:20220314T131500
LOCATION:Virtual Seminar
SEQUENCE:0
SUMMARY:Alane Suhr (Cornell University) “Reasoning and Learning in Interactive Natural Language Systems”
URL:https://www.clsp.jhu.edu/events/alane-suhr-cornell-university-reasoning-and-learning-in-interactive-natural-language-systems/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,March\,Suhr
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-21616@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:Abstract
\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have shown that the absence of appropriate tuning\, specifically in the presence of semantic shift\, can hinder robustness of the underlying methods. However\, little is known about the practical effect this sensitivity may have on downstream longitudinal analyses. We explore this gap in the literature through a timely case study: understanding shifts in depression during the course of the COVID-19 pandemic. We find that inclusion of only a small number of semantically-unstable features can promote significant changes in longitudinal estimates of our target outcome. At the same time\, we demonstrate that a recently-introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and\, in turn\, improve predictive generalization.
DTSTART;TZID=America/New_York:20220318T120000
DTEND;TZID=America/New_York:20220318T131500
LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Student Seminar – Keith Harrigian “The Problem of Semantic Shift in Longitudinal Monitoring of Social Media”
URL:https://www.clsp.jhu.edu/events/student-seminar-keith-harrigian-the-problem-of-semantic-shift-in-longitudinal-monitoring-of-social-media/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Harrigian\,March
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-21497@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nWhile the “deep learning tsunami” continues to define the state of the art in speech and language processing\, finite-state transducer grammars developed by linguists and engineers are still widely used in industrial\, highly-multilingual settings\, particularly for symbolic\, “front-end” speech applications. In this talk\, I will first briefly review the current state of the OpenFst and OpenGrm finite-state transducer libraries. I then review two “late-breaking” algorithms found in these libraries. The first is a heuristic but highly-effective general-purpose optimization routine for weighted transducers. The second is an algorithm for computing the single shortest string of non-deterministic weighted acceptors which lack certain properties required by classic shortest-path algorithms. I will then illustrate how the OpenGrm tools can be used to induce a finite-state string-to-string transduction model known as a pair n-gram model. This model has been applied to grapheme-to-phoneme conversion\, loanword detection\, abbreviation expansion\, and back-transliteration\, among other tasks.
\nBiography
\nKyle Gorman is an assistant professor of linguistics at the Graduate Center\, City University of New York\, and director of the master’s program in computational linguistics\; he is also a software engineer in the speech and language algorithms group at Google. With Richard Sproat\, he is the coauthor of Finite-State Text Processing (Morgan & Claypool\, 2021) and the creator of Pynini\, a finite-state text processing library for Python. He has also published on statistical methods for comparing computational models\, text normalization\, grapheme-to-phoneme conversion\, and morphological analysis\, as well as many topics in linguistic theory.
DTSTART;TZID=America/New_York:20220401T120000
DTEND;TZID=America/New_York:20220401T131500
LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Kyle Gorman (City University of New York) “Weighted Finite-State Transducers: The Later Years”
URL:https://www.clsp.jhu.edu/events/kyle-gorman-city-university-of-new-york-weighted-finite-state-transducers-the-later-years/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Gorman\,March
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-22374@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nIn recent years\, the field of Natural Language Processing has seen a profusion of tasks\, datasets\, and systems that facilitate reasoning about real-world situations through language (e.g.\, RTE\, MNLI\, COMET). Such systems might\, for example\, be trained to consider a situation where “somebody dropped a glass on the floor\,” and conclude it is likely that “the glass shattered” as a result. In this talk\, I will discuss three pieces of work that revisit assumptions made by or about these systems. In the first work\, I develop a Defeasible Inference task\, which enables a system to recognize when a prior assumption it has made may no longer be true in light of new evidence it receives. The second work I will discuss revisits partial-input baselines\, which have highlighted issues of spurious correlations in natural language reasoning datasets and led to unfavorable assumptions about models’ reasoning abilities. In particular\, I will discuss experiments that show models may still learn to reason in the presence of spurious dataset artifacts. Finally\, I will touch on work analyzing harmful assumptions made by reasoning models in the form of social stereotypes\, particularly in the case of free-form generative reasoning models.
\nBiography
\nRachel Rudinger is an Assistant Professor in the Department of Computer Science at the University of Maryland\, College Park. She holds joint appointments in the Department of Linguistics and the Institute for Advanced Computer Studies (UMIACS). In 2019\, Rachel completed her Ph.D. in Computer Science at Johns Hopkins University in the Center for Language and Speech Processing. From 2019-2020\, she was a Young Investigator at the Allen Institute for AI in Seattle\, and a visiting researcher at the University of Washington. Her research interests include computational semantics\, common-sense reasoning\, and issues of social bias and fairness in NLP.
DTSTART;TZID=America/New_York:20220916T120000
DTEND;TZID=America/New_York:20220916T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Rachel Rudinger (University of Maryland\, College Park) “Not So Fast!: Revisiting Assumptions in (and about) Natural Language Reasoning”
URL:https://www.clsp.jhu.edu/events/rachel-rudinger-university-of-maryland-college-park-not-so-fast-revisiting-assumptions-in-and-about-natural-language-reasoning/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Rudinger\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-22375@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nI will present our work on data augmentation using style transfer as a way to improve domain adaptation in sequence labeling tasks. The target domain is social media data\, and the task is named entity recognition (NER). The premise is that we can transform the labelled out-of-domain data into something that stylistically is more closely related to the target data. Then we can train a model on a combination of the generated data and the smaller amount of in-domain data to improve NER prediction performance. I will show recent empirical results on these efforts.
\nIf time allows\, I will also give an overview of other research projects I’m currently leading at the RiTUAL (Research in Text Understanding and Analysis of Language) lab. The common thread among all these research problems is the scarcity of labeled data.
\nBiography
\nThamar Solorio is a Professor of Computer Science at the University of Houston (UH). She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica\, Óptica y Electrónica\, in Puebla\, Mexico. Her research interests include information extraction from social media data\, enabling technology for code-switched data\, stylistic modeling of text\, and more recently multimodal approaches for online content understanding. She is the director and founder of the RiTUAL Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution\, and recipient of the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is currently serving a second term as an elected board member of the North American Chapter of the Association for Computational Linguistics and was PC co-chair for NAACL 2019. She recently joined the team of Editors in Chief for the ACL Rolling Review (ARR) system. Her research is currently funded by the NSF and by Adobe.
DTSTART;TZID=America/New_York:20220923T120000
DTEND;TZID=America/New_York:20220923T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Thamar Solorio (University of Houston) “Style Transfer for Data Augmentation in Sequence Labeling Tasks”
URL:https://www.clsp.jhu.edu/events/thamar-solorio-university-of-houston-style-transfer-for-data-augmentation-in-sequence-labeling-tasks/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,September\,Solorio
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-22380@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nThe availability of large multilingual pre-trained language models has opened up exciting pathways for developing NLP technologies for languages with scarce resources. In this talk I will advocate for the need to go beyond the most common languages in multilingual evaluation\, and discuss the challenges of handling new\, unseen-during-training languages and varieties. I will also share some of my experiences working with indigenous and other endangered language communities and activists.
\nBiography
\nAntonios Anastasopoulos is an Assistant Professor in Computer Science at George Mason University. In 2019\, Antonis received his PhD in Computer Science from the University of Notre Dame and then worked as a postdoctoral researcher at the Language Technologies Institute at Carnegie Mellon University. His research interests revolve around computational linguistics and natural language processing with a focus on low-resource settings\, endangered languages\, and cross-lingual learning.
DTSTART;TZID=America/New_York:20220930T120000
DTEND;TZID=America/New_York:20220930T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Antonios Anastasopoulos (George Mason University) “NLP Beyond the Top-100 Languages”
URL:https://www.clsp.jhu.edu/events/antonis-anastasopoulos-george-mason-university/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Anastasopoulos\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23320@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nSpeech communication represents a core domain for education\, team problem solving\, social engagement\, and business interactions. The ability of speech technology to extract layers of knowledge and assess engagement content represents the next generation of advanced speech solutions. Today\, the emergence of big data\, machine learning\, and voice-enabled speech systems has driven the need for effective voice capture and automatic speech/speaker recognition. The ability to employ speech and language technology to assess human-to-human interactions offers new research paradigms with profound impact on assessing human interaction. In this talk\, we will focus on big-data naturalistic audio processing relating to (i) child learning spaces and (ii) the NASA APOLLO lunar missions. ML-based technology advancements include automatic audio diarization\, speech recognition\, and speaker recognition. Child-teacher assessment of conversational interactions is explored\, including keywords and “WH-words” (e.g.\, who\, what\, etc.). Diarization processing solutions are applied both to classroom/learning-space child speech and to massive APOLLO data. CRSS-UTDallas is expanding our original Apollo-11 corpus\, a massive multi-track audio processing challenge\, to make available 150\,000 hours of Apollo mission data to be shared with (i) the speech/language technology community\, (ii) STEM/science and team-based researchers\, and (iii) education/historical/archiving specialists. Our goals are to provide resources that allow us to better understand how people work and learn collaboratively and\, for Apollo\, how teams accomplished one of mankind’s greatest scientific and technological challenges of the last century.
\nBiography
\nJohn H.L. Hansen received Ph.D. and M.S. degrees from the Georgia Institute of Technology and a B.S.E.E. from Rutgers University. He joined the University of Texas at Dallas (UTDallas) in 2005\, where he currently serves as Associate Dean for Research\, Professor of ECE\, Distinguished University Chair in Telecommunications Engineering\, and directs the Center for Robust Speech Systems (CRSS). He is an ISCA Fellow and IEEE Fellow\, and has served as Member and TC-Chair of the IEEE Signal Processing Society Speech & Language Processing Technical Committee (SLTC)\, and as Technical Advisor to the U.S. Delegate for NATO (IST/TG-01). He served as ISCA President (2017-21) and continues to serve on the ISCA Board (2015-23) as Treasurer. He has supervised 99 PhD/MS thesis candidates (EE\, CE\, BME\, TE\, CS\, Ling.\, Cog.Sci.\, Spch.Sci.\, Hear.Sci.) and was the recipient of the 2020 UT-Dallas Provost’s Award for Graduate PhD Research Mentoring. He is author/co-author of 865 journal/conference papers\, including 14 textbooks in the field of speech/language/hearing processing and technology\, among them the coauthored textbook Discrete-Time Processing of Speech Signals (IEEE Press\, 2000) and the lead-authored report “The Impact of Speech Under ‘Stress’ on Military Speech Technology” (NATO RTO-TR-10\, 2000). He served as Organizer and Chair/Co-Chair/Technical Chair for ISCA INTERSPEECH-2022\, IEEE ICASSP-2010\, IEEE SLT-2014\, and ISCA INTERSPEECH-2002\, and as Technical Chair for IEEE ICASSP-2024. He received the 2022 IEEE Signal Processing Society Leo Beranek Meritorious Service Award.
DTSTART;TZID=America/New_York:20230303T120000
DTEND;TZID=America/New_York:20230303T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:John Hansen (University of Texas at Dallas) “Challenges and Advancements in Speaker Diarization & Recognition for Naturalistic Data Streams”
URL:https://www.clsp.jhu.edu/events/john-hansen-university-of-texas-at-dallas/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Hansen\,March
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23439@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nAs data-based technologies proliferate\, it is increasingly important for researchers to be aware of their work’s wider impact. Concerns like navigating the IRB and figuring out copyright and licensing issues are still key\, but the current focus shift to matters like inclusivity\, fairness\, and transparency\, and their impact on the research/development life cycle\, has added complexity to the research task. In this talk\, we will take a broad look at the various ways ethics intersects with natural language processing\, machine learning\, and artificial intelligence research\, and discuss strategies and resources for managing these concerns within the broader research framework.
\nBiography
\nDenise is responsible for the overall operation of LDC’s External Relations group which includes intellectual property management\, licensing\, regulatory matters\, publications\, membership and communications. Before joining LDC\, she practiced law for over 20 years in the areas of international trade\, intellectual property and commercial litigation. She has an A.B. in Political Science from Bryn Mawr College and a Juris Doctor degree from the University of Miami School of Law.
DTSTART;TZID=America/New_York:20230310T120000
DTEND;TZID=America/New_York:20230310T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street
SEQUENCE:0
SUMMARY:Denise DiPersio (Linguistic Data Consortium\, University of Pennsylvania) “Data and Ethics: Where Does the Twain Meet?”
URL:https://www.clsp.jhu.edu/events/denise-dipersio-linguistic-data-consortium-university-of-pennsylvania-data-and-ethics-where-does-the-twain-meet/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,DiPersio\,March
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23505@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract
\nRecent advances in large pretrained language models have unlocked exciting new applications of Natural Language Generation for creative tasks\, such as lyrics or humour generation. In this talk we will discuss recent works by our team at Alexa AI and current challenges: (1) Pun understanding and generation: we release new datasets for pun understanding and the novel task of context-situated pun generation\, and demonstrate the value of our annotations for pun classification and generation tasks. (2) Song lyric generation: we design a hierarchical lyric generation framework that enables us to generate pleasantly-singable lyrics without training on melody-lyric aligned data\, and show that our approach is competitive with strong baselines supervised on parallel data. (3) Create with Alexa: a multimodal story creation experience recently launched on Alexa devices\, which leverages story text generation models in tandem with story visualization and background music generation models to produce multimodal stories for kids.
\nBiography
\nAlessandra Cervone is an Applied Scientist in the Natural Understanding team at Amazon Alexa AI. Alessandra holds an MSc in Speech and Language Processing from the University of Edinburgh and a PhD in CS from the University of Trento (Italy). During her PhD\, Alessandra worked on computational models of coherence in open-domain dialogue\, advised by Giuseppe Riccardi. In the first year of the PhD\, she was the team leader of one of the teams selected to compete in the first edition of the Alexa Prize. More recently\, her research interests have focused on natural language generation and its evaluation\, in particular in the context of creative AI applications.
DTSTART;TZID=America/New_York:20230317T120000
DTEND;TZID=America/New_York:20230317T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Alessandra Cervone (Amazon) “Controllable Text Generation for Creative Applications”
URL:https://www.clsp.jhu.edu/events/alexxandra-cervone-amazon/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,Cervone\,March
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23555@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:
DTSTART;TZID=America/New_York:20230327T120000
DTEND;TZID=America/New_York:20230327T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Student Seminar – Desh Raj
URL:https://www.clsp.jhu.edu/events/student-seminar-desh-raj-2/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2023\,March\,Raj
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23513@www.clsp.jhu.edu
DTSTAMP:20240329T132713Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:
Abstract
\nDespite many recent advances in automatic speech recognition (ASR)\, linguists and language communities engaged in language documentation projects continue to face the obstacle of the “transcription bottleneck”. Researchers in NLP typically do not distinguish between widely spoken languages that currently happen to have few training resources and endangered languages that will never have abundant data. As a result\, we often fail to thoroughly explore when ASR is helpful for language documentation\, what architectures work best for the sorts of languages that are in need of documentation\, and how data can be collected and organized to produce optimal results. In this talk I describe several projects that attempt to bridge the gap between the promise of ASR for language documentation and the reality of using this technology in real-world settings.
\nBiography
\nAbstract
<p>\nLarge language models (LLMs) have demonstrated incredible power\, but they also possess vulnerabilities that can lead to misuse and potential attacks. In this presentation\, we will address two fundamental questions regarding the responsible utilization of LLMs: (1) How can we accurately identify AI-generated text? (2) What measures can safeguard the intellectual property of LLMs? We will introduce two recent watermarking techniques designed for text and models\, respectively. Our discussion will encompass the theoretical underpinnings that ensure the correctness of watermark detection\, along with robustness against evasion attacks. Furthermore\, we will showcase empirical evidence validating their effectiveness. These findings establish a solid technical groundwork for policymakers\, legal professionals\, and generative AI practitioners alike.</p>
\nBiography
<p>\nLei Li is an Assistant Professor in the Language Technologies Institute at Carnegie Mellon University. He received his Ph.D. from Carnegie Mellon University School of Computer Science. He is a recipient of the ACL 2021 Best Paper Award\, the CCF Young Elite Award in 2019\, CCF distinguished speaker in 2017\, the Wu Wen-tsün AI prize in 2017\, and the 2012 ACM SIGKDD dissertation award (runner-up)\, and is recognized as a Notable Area Chair of ICLR 2023. Previously\, he was a faculty member at UC Santa Barbara. Prior to that\, he founded ByteDance AI Lab in 2016 and led its research in NLP\, ML\, Robotics\, and Drug Discovery. He launched ByteDance’s machine translation system VolcTrans and AI writing system Xiaomingbot\, serving one billion users.</p>
DTSTART;TZID=America/New_York:20230901T120000 DTEND;TZID=America/New_York:20230901T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Lei Li (Carnegie Mellon University) “Empowering Responsible Use of Large Language Models” URL:https://www.clsp.jhu.edu/events/lei-li-carnegie-mellon-university-empowering-responsible-use-of-large-language-models/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Li\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-23886@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nThe arms race to build increasingly larger\, powerful language models (LMs) in the past year has been remarkable. Yet incorporating LMs effectively into practical applications that facilitate manual workflows remains challenging. I will discuss LMs’ limiting factors and our efforts to overcome them. I will start with challenges surrounding efficient and robust LM alignment. I will share insights from our recent paper <strong>“Self-Instruct” (ACL 2023)</strong>\, where we used vanilla (unaligned) LMs to align themselves\, an approach that has yielded some success. Then\, I will move on to the challenge of tracing the output of LMs to reliable sources\, a weakness that makes them prone to hallucinations. I will discuss our recent approach of <strong>‘according-to’ prompting</strong>\, which steers LMs to quote directly from sources observed in their pre-training. If time permits\, I will discuss our ongoing project to adapt LMs to interact with web pages. Throughout the presentation\, I will highlight our progress and end with questions about our future progress.</p>\n
Biography
<p>\nDaniel Khashabi is an assistant professor in computer science at Johns Hopkins University and a member of the Center for Language and Speech Processing (CLSP). He is interested in building reasoning-driven modular NLP systems that are robust\, transparent\, and communicative\, particularly those that use natural language as the communication medium. Khashabi has published over 40 papers on natural language processing and AI in top-tier venues. His research has won the ACL 2023 Outstanding Paper Award\, the NAACL 2022 Best Paper Award\, research gifts from the Allen Institute for AI\, and an Amazon Research Award 2023. Before joining Hopkins\, he was a postdoctoral fellow at the Allen Institute for AI (2019-2022) and obtained a Ph.D. from the University of Pennsylvania in 2019.</p>
DTSTART;TZID=America/New_York:20230908T120000 DTEND;TZID=America/New_York:20230908T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Daniel Khashabi (Johns Hopkins University) “Building More Helpful Language Models” URL:https://www.clsp.jhu.edu/events/daniel-khashabi-johns-hopkins-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Khashabi\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-23888@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nEmbedding text sequences is a widespread requirement in modern language understanding. Existing approaches focus largely on constant-size representations. This is problematic\, as the amount of information contained in text often varies with the length of the input. We propose a solution called Nugget\, which encodes language into a representation based on a dynamically selected subset of input tokens. These nuggets are learned through tasks like autoencoding and machine translation\, and intuitively segment language into meaningful units. We demonstrate Nugget outperforms related approaches in tasks involving semantic comparison. Finally\, we illustrate these compact units allow for expanding the contextual window of a language model (LM)\, suggesting new future LMs that can condition on significantly larger amounts of content.</p>
DTSTART;TZID=America/New_York:20230911T120000 DTEND;TZID=America/New_York:20230911T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Guanghui Qin “Nugget: Neural Agglomerative Embeddings of Text (ICML 2023)” URL:https://www.clsp.jhu.edu/events/student-seminar-guanghui-qin/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Qin\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-23892@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nThe growing power in computing and AI promises a near-term future of human-machine teamwork. In this talk\, I will present my research group’s efforts in understanding the complex dynamics of human-machine interaction and designing intelligent machines aimed to assist and collaborate with people. I will focus on 1) tools for onboarding machine teammates and authoring machine assistance\, 2) methods for detecting\, and broadly managing\, errors in collaboration\, and 3) building blocks of knowledge needed to enable ad hoc human-machine teamwork. I will also highlight our recent work on designing assistive\, collaborative machines to support older adults aging in place.</p>
\nBiography
<p>\nChien-Ming Huang is the John C. Malone Assistant Professor in the Department of Computer Science at the Johns Hopkins University. His research focuses on designing interactive AI aimed to assist and collaborate with people. He publishes in top-tier venues in HRI\, HCI\, and robotics including Science Robotics\, HRI\, CHI\, and CSCW. His research has received media coverage from MIT Technology Review\, Tech Insider\, and Science Nation. Huang completed his postdoctoral training at Yale University and received his Ph.D. in Computer Science at the University of Wisconsin–Madison. He is a recipient of the NSF CAREER award. <a href="https://www.cs.jhu.edu/~cmhuang/">https://www.cs.jhu.edu/~cmhuang/</a></p>
DTSTART;TZID=America/New_York:20230915T120000 DTEND;TZID=America/New_York:20230915T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Chien-Ming Huang (Johns Hopkins University) “Becoming Teammates: Designing Assistive\, Collaborative Machines” URL:https://www.clsp.jhu.edu/events/chien-ming-huang-johns-hopkins-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Huang\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-23894@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nThe use of NLP in the realm of financial technology is broad and complex\, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks\; however\, no LLM specialized for the financial domain has been reported in the literature. In this work\, we present BloombergGPT\, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg’s extensive data sources\, perhaps the largest domain-specific dataset yet\, augmented with 345 billion tokens from general-purpose datasets. We validate BloombergGPT on standard LLM benchmarks\, open financial benchmarks\, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally\, we explain our modeling choices\, training process\, and evaluation methodology.</p>
\nBiography
<p>Mark Dredze is the John C. Malone Professor of Computer Science at Johns Hopkins University and the Director of Research (Foundations of AI) for the JHU AI-X Foundry. He develops Artificial Intelligence Systems based on natural language processing and explores applications to public health and medicine.</p>
<p>\nProf. Dredze is affiliated with the Malone Center for Engineering in Healthcare and the Center for Language and Speech Processing\, among others. He holds a joint appointment in the Biomedical Informatics & Data Science Section (BIDS)\, under the Department of Medicine (DOM)\, Division of General Internal Medicine (GIM) in the School of Medicine. He obtained his PhD from the University of Pennsylvania in 2009.</p>
DTSTART;TZID=America/New_York:20230918T120000 DTEND;TZID=America/New_York:20230918T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Mark Dredze (Johns Hopkins University) “BloombergGPT: A Large Language Model for Finance” URL:https://www.clsp.jhu.edu/events/mark-dredze-johns-hopkins-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Dredze\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-23983@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nVisually rich documents (scanned or digital) remain important for many consumer and business use cases. During this talk we will share recent work from our team in the Document Intelligence Lab of Adobe Research to understand\, create\, and interact with these documents. First\, we’ll share a series of work on building models to decompose and understand the structure of documents to support use cases around document analysis and accessibility. Next\, we’ll explore document semantic understanding for a project where we convert natural language contract clauses to code to support business automation. Finally\, we’ll discuss DocEdit\, a model and dataset that enables editing structured documents from natural language.</p>
\nBIOS:
\n<p>Rajiv Jain is a Senior Research Scientist in the Document Intelligence Lab in Adobe Research\, where his research focuses on understanding the layout\, content\, and interaction with documents. Prior to joining Adobe\, Rajiv was a consultant at DARPA\, where he worked on the Media Forensics Program to secure digital imagery. He previously served for 10 years as a researcher for the Department of Defense where he worked on projects around large scale systems\, computer vision\, and network security. He received his PhD in computer science from the University of Maryland\, College Park\, working in the field of document image analysis and retrieval.</p>\n<p>Chris Tensmeyer primarily focuses on multi-modal document layout and content understanding as a Research Scientist in the Document Intelligence Lab of Adobe Research. Since joining Adobe 5 years ago\, his work has directly impacted popular Adobe features such as mobile Acrobat Liquid Mode\, PDF table extraction\, handwriting recognition\, and scanned document detection. Other research interests include general Computer Vision and Deep Learning. He received his PhD in Computer Science from Brigham Young University on the topic of Deep Learning for Document Image Analysis.</p>
DTSTART;TZID=America/New_York:20230922T120000 DTEND;TZID=America/New_York:20230922T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Rajiv Jain and Chris Tensmeyer (Adobe) “Document Intelligence at Adobe Research” URL:https://www.clsp.jhu.edu/events/rajiv-jain-and-chris-tensmeyer-adobe-document-intelligence-at-adobe-research/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Jain and Tensmeyer\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-23896@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nThe field of NLP is in the midst of a disruptive shift\, fueled most recently by the advent of large language models (LLMs)\, with impacts on our methodologies\, funding and public perception. While the core technologies and scope of real-world impact of our field may be changing (everything is different!)\, many of the same key challenges faced since the inception of our field remain (nothing has changed). In this talk I’ll describe recent work characterizing and tackling some of these challenges\, notably: data-efficient domain adaptation and lifelong learning. I will also anchor discussion of cycles and shifts in the field by describing findings from a qualitative study of factors shaping the community over time\, including culture\, incentives\, and infrastructure. Through these complementary lenses into the past\, present and future\, I aim to inspire shared hope\, excitement and discussion.</p>
\nBio
\n<p>Emma Strubell is the Raj Reddy Assistant Professor in the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University\, and a Visiting Scientist at the Allen Institute for Artificial Intelligence. Previously she held research scientist roles at Google and FAIR after earning her doctoral degree in 2019 from the University of Massachusetts Amherst. Her research lies at the intersection of natural language processing and machine learning\, with a focus on providing pragmatic solutions to practitioners who wish to gain insights from natural language text via computation- and data-efficient AI. Her work has been recognized with a Madrona AI Impact Award\, best paper awards at ACL and EMNLP\, and cited in news outlets including the New York Times and Wall Street Journal.</p> DTSTART;TZID=America/New_York:20230925T120000 DTEND;TZID=America/New_York:20230925T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Emma Strubell (Carnegie Mellon University) “Large Language Models: Everything’s Different and Nothing Has Changed” URL:https://www.clsp.jhu.edu/events/emma-strubell-carnegie-mellon-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,September\,Strubell END:VEVENT BEGIN:VEVENT UID:ai1ec-23898@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nAny valuable NLP dataset has traditionally been shipped with crowdsourced categorical labels. Instructions for collecting these labels are easy to communicate and the labels themselves are easy to annotate. However\, as self-supervision based methods are getting better at basically everything\, human annotations may need to provide more nuanced supervision or enable more detailed evaluation in order to be worth further collecting. One natural extension to existing categorical annotation schemes is to obtain uncertainty information beyond a single hard label. In this talk\, I will discuss my recent efforts on introducing scalar labels in place of categorical labels as a form of uncertainty annotation. We demonstrate that\, compared to other more obvious annotation schemes for eliciting uncertainty information\, scalar labels are significantly more cost-effective to annotate\, provide reliable evaluation\, and have a theoretical connection to existing predictive uncertainty metrics. In particular\, they motivate using other losses as surrogates for calibration evaluation.</p>
DTSTART;TZID=America/New_York:20230929T120000 DTEND;TZID=America/New_York:20230929T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:CLSP Student Seminar – Zhengping Jiang “Scalar Labels for Capturing Human Uncertainty” URL:https://www.clsp.jhu.edu/events/clsp-student-seminar-zhengping-jiang/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Jiang\,September END:VEVENT BEGIN:VEVENT UID:ai1ec-24459@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20240301T120000 DTEND;TZID=America/New_York:20240301T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Mohit Iyyer “Improving\, Evaluating and Detecting Long-Form LLM-Generated Text” URL:https://www.clsp.jhu.edu/events/mohit-iyyer-improving-evaluating-and-detecting-long-form-llm-generated-text/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,Iyyer\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-24461@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nMost machine translation systems operate on the sentence level while humans write and translate within a given context. Operating on individual sentences forces error-prone sentence segmentation into the machine translation pipeline. This limits the upper-bound performance of these systems by creating noisy training bitext. Further\, many grammatical features necessitate inter-sentential context in order to translate\, which makes perfect sentence-level machine translation an impossible task. In this talk\, we will cover the inherent limits of sentence-level machine translation. Following this\, we will explore a key obstacle in the way of true context-aware machine translation: an abject lack of data. Finally\, we will cover recent work that provides (1) a new evaluation dataset that specifically addresses the translation of context-dependent discourse phenomena and (2) reconstructed documents from large-scale sentence-level bitext that can be used to improve performance when translating these types of phenomena.</p>
DTSTART;TZID=America/New_York:20240304T120000 DTEND;TZID=America/New_York:20240304T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Rachel Wicks (JHU) “To Sentences and Beyond: Paving the Way for Context-Aware Machine Translation” URL:https://www.clsp.jhu.edu/events/rachel-wicks-jhu/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,March\,Wicks END:VEVENT BEGIN:VEVENT UID:ai1ec-24465@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nLarge Language Models (LLMs) have demonstrated remarkable capabilities across various domains. However\, it is still very challenging to build highly reliable applications with LLMs that support specialized use cases. LLMs trained on web data often excel at capturing general language patterns\, but they can struggle to support specialized domains and personalized user needs. Moreover\, LLMs can produce errors that are deceptively plausible\, making them potentially dangerous for high-trust scenarios. In this talk\, I will discuss some of our recent efforts in addressing these challenges with data-efficient tuning methods and a novel factuality evaluation framework. Specifically\, my talk will focus on building multilingual applications\, one crucial use case often characterized by limited tuning and evaluation data.</p>
\nBio
<p>Xinyi (Cindy) Wang is a research scientist at Google DeepMind working on Large Language Models (LLMs) and their application to generative question-answering. She has worked on multilingual instruction-tuning for Gemini and multilingual generative models used in Google search. Before Google DeepMind\, Cindy Wang obtained her PhD degree in Language Technologies at Carnegie Mellon University. During her PhD\, she mainly worked on developing data-efficient natural language processing (NLP) systems. She has made several contributions in data selection\, data representation\, and model adaptation for multilingual NLP.</p>
DTSTART;TZID=America/New_York:20240308T120000 DTEND;TZID=America/New_York:20240308T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Cindy Wang (Google DeepMind) “Building Data-Efficient and Reliable Applications with Large Language Models” URL:https://www.clsp.jhu.edu/events/cindy-wang-google-deepmind-building-data-efficient-and-reliable-applications-with-large-language-models/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,March\,Wang END:VEVENT BEGIN:VEVENT UID:ai1ec-24479@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nThe speech field is evolving to solve more challenging scenarios\, such as multi-channel recordings with multiple simultaneous talkers. Given the many types of microphone setups out there\, we present the UniX-Encoder\, a universal encoder designed for multiple tasks that works with any microphone array\, in both solo and multi-talker environments. Our research enhances previous multichannel speech processing efforts in four key areas: 1) Adaptability: Contrasting traditional models constrained to certain microphone array configurations\, our encoder is universally compatible. 2) Multi-Task Capability: Beyond the single-task focus of previous systems\, UniX-Encoder acts as a robust upstream model\, adeptly extracting features for diverse tasks including ASR and speaker recognition. 3) Self-Supervised Training: The encoder is trained without requiring labeled multi-channel data. 4) End-to-End Integration: In contrast to models that first beamform and then process single channels\, our encoder offers an end-to-end solution\, bypassing explicit beamforming or separation. To validate its effectiveness\, we tested the UniX-Encoder on a synthetic multi-channel dataset derived from the LibriSpeech corpus. Across tasks like speech recognition and speaker diarization\, our encoder consistently outperformed combinations like the WavLM model with the BeamformIt frontend.</p>
DTSTART;TZID=America/New_York:20240311T200500 DTEND;TZID=America/New_York:20240311T210500 SEQUENCE:0 SUMMARY:Zili Huang (JHU) “UniX-Encoder: A Universal X-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing” URL:https://www.clsp.jhu.edu/events/zili-huang-jhu-unix-encoder-a-universal-x-channel-speech-encoder-for-ad-hoc-microphone-array-speech-processing/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,Huang\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-24481@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:<h4>Abstract</h4>
<p>\nNatural language provides an intuitive and powerful interface to access knowledge at scale. Modern language systems draw information from two rich knowledge sources: (1) information stored in their parameters during massive pretraining and (2) documents retrieved at inference time. Yet\, we are far from building systems that can reliably provide information from such knowledge sources. In this talk\, I will discuss paths for more robust systems. In the first part of the talk\, I will present a module for scaling retrieval-based knowledge augmentation. We learn a compressor that maps retrieved documents into textual summaries prior to in-context integration. This not only reduces the computational costs but also filters irrelevant or incorrect information. In the second half of the talk\, I will discuss the challenges of updating knowledge stored in model parameters and propose a method to prevent models from reciting outdated information by identifying facts that are prone to rapid change. I will conclude my talk by proposing an interactive system that can elicit information from users when needed.</p>
\nBiog raphy
<p>\nEunsol Choi is an assistant professor in the Computer Science department at the University of Texas at Austin. Prior to UT\, she spent a year at Google AI as a visiting researcher. Her research area spans natural language processing and machine learning. She is particularly interested in interpreting and reasoning about text in a dynamic real-world context. She is a recipient of a Facebook research fellowship\, a Google faculty research award\, a Sony faculty award\, and an outstanding paper award at EMNLP. She received a Ph.D. in computer science and engineering from the University of Washington and a B.A. in mathematics and computer science from Cornell University.</p>
\nDTSTART;TZID=America/New_York:20240315T120000 DTEND;TZID=America/New_York:20240315T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Eunsol Choi (University of Texas at Austin) “Knowledge-Rich Language Systems in a Dynamic World” URL:https://www.clsp.jhu.edu/events/eunsol-choi-university-of-texas-at-austin-knowledge-rich-language-systems-in-a-dynamic-world/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,Choi\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-24489@www.clsp.jhu.edu DTSTAMP:20240329T132713Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
<p>\nOver the past decade\, the field of Speech Generation has seen significant progress in enhancing speech quality and naturalness. Despite these advancements\, persistent challenges such as speech noise\, limited high-quality data availability\, and the lack of robustness in speech generation systems remain. Additionally\, the evaluation of speech presents a significant obstacle for comprehensive assessment at scale. Concurrently\, recent breakthroughs in Large Language Models (LLMs) have revolutionized text generation and natural language processing. However\, the complexity of spoken language introduces unique hurdles\, including managing long speech waveform sequences. In this presentation\, I will explore recent innovations in speech synthesis with spoken language modeling\, evaluation for generative speech systems\, and high-fidelity speech enhancement. Finally\, I will discuss prospective avenues for future research aimed at addressing these challenges.</p>
\nBi o
<p>\nSoumi Maiti is a postdoctoral researcher at the Language Technologies Institute\, Carnegie Mellon University\, where she works on speech and language processing. Her research broadly focuses on building intelligent systems that can communicate with humans naturally. She earned a Ph.D. from the Graduate Center\, City University of New York (CUNY) with the Graduate Center Fellowship\, advised by Prof. Michael Mandel. She earned her B.Tech. in Computer Science from the Indian Institute of Engineering Science and Technology\, Shibpur. Previously\, she has worked in the Text-To-Speech team at Apple. She has also worked at Google and Interactions LLC as a student researcher and research intern. She has worked as an adjunct lecturer at Brooklyn College\, CUNY\, for three years and served as a Math Fellow at Hunter College. She has served as session chair at ICASSP 2024\, ICASSP 2023\, SLT 2023 and others\, and as area chair at EMNLP 2023.</p>
\nDTSTART;TZID=America/New_York:20240329T120000 DTEND;TZID=America/New_York:20240329T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Soumi Maiti (CMU) “Towards Robust Speech Generation” URL:https://www.clsp.jhu.edu/events/soumi-maiti/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,Maiti\,March END:VEVENT END:VCALENDAR