BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-20987@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nWhile there is a vast amount of text written about nearly any topic\, this is often difficult for someone unfamiliar with a specific field to understand. Automated text simplification aims to reduce the complexity of a document\, making it more comprehensible to a broader audience. Much of the research in this field has traditionally focused on simplification sub-tasks\, such as lexical\, syntactic\, or sentence-level simplification. However\, current systems struggle to consistently produce high-quality simplifications. Phrase-based models tend to make too many poor transformations\; on the other hand\, recent neural models\, while producing grammatical output\, often do not make all needed changes to the original text. In this thesis\, I discuss novel approaches for improving lexical and sentence-level simplification systems. Regarding sentence simplification models\, after noting that encouraging diversity at inference time leads to significant improvements\, I take a closer look at the idea of diversity and perform an exhaustive comparison of diverse decoding techniques on other generation tasks. I also discuss the limitations in the framing of current simplification tasks\, which prevent these models from yet being practically useful. Thus\, I also propose a retrieval-based reformulation of the problem. Specifically\, starting with a document\, I identify concepts critical to understanding its content\, and then retrieve documents relevant for each concept\, re-ranking them based on the desired complexity level.
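The retrieval-based reformulation described above (identify key concepts\, retrieve documents per concept\, re-rank by complexity) can be pictured with a minimal sketch. The scoring functions below are simplistic stand-ins chosen for illustration\, not the actual models used in the thesis:

```python
# Schematic pipeline: identify a concept, retrieve documents that mention it,
# then re-rank so that simpler documents come first.

def relevance(concept: str, doc: str) -> int:
    """Toy relevance score: count occurrences of the concept phrase."""
    return doc.lower().count(concept.lower())

def complexity(doc: str) -> float:
    """Toy complexity proxy: average word length (longer words ~ harder text)."""
    words = doc.split()
    return sum(len(w) for w in words) / max(len(words), 1)

def retrieve_and_rerank(concept: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Keep documents mentioning the concept, preferring lower-complexity ones."""
    relevant = [d for d in corpus if relevance(concept, d) > 0]
    return sorted(relevant, key=complexity)[:top_k]

corpus = [
    "Gradient descent iteratively minimizes a differentiable objective function.",
    "Gradient descent is a way to find the lowest point of a curve step by step.",
    "Backpropagation computes gradients efficiently via the chain rule.",
]
ranked = retrieve_and_rerank("gradient descent", corpus)
# The plainly worded second document ranks first under the complexity proxy.
```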
\nBiography
\nI’m a research scientist at the HLTCOE at Johns Hopkins University. My primary research interests are in language generation\, diverse and constrained decoding\, and information retrieval. During my PhD I focused mainly on the task of text simplification\, and now am working on formulating structured prediction problems as end-to-end generation tasks. I received my PhD in July 2021 from the University of Pennsylvania with Chris Callison-Burch and Marianna Apidianaki.
\nDTSTART;TZID=America/New_York:20211022T120000 DTEND;TZID=America/New_York:20211022T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Reno Kriz (HLTCOE – JHU) “Towards a Practically Useful Text Simplification System” URL:https://www.clsp.jhu.edu/events/reno-kriz-hltcoe-jhu-towards-a-practically-useful-text-simplification-system/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,Kriz\,October END:VEVENT BEGIN:VEVENT UID:ai1ec-21023@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nSpeech data is notoriously difficult to work with due to a variety of codecs\, lengths of recordings\, and meta-data formats. We present Lhotse\, a speech data representation library that draws upon lessons learned from the Kaldi speech recognition toolkit and brings its concepts into the modern deep learning ecosystem. Lhotse provides a common JSON description format with corresponding Python classes and data preparation recipes for over 30 popular speech corpora. Various datasets can be easily combined together and re-purposed for different tasks. The library handles multi-channel recordings\, long recordings\, local and cloud storage\, and lazy and on-the-fly operations\, amongst other features. We introduce the Cut and CutSet concepts\, which simplify common data wrangling tasks for audio and help incorporate the acoustic context of speech utterances. Finally\, we show how Lhotse leverages PyTorch data API abstractions and adapts them to handle speech data for deep learning.
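The common JSON description format mentioned above can be sketched roughly as follows. The field names here are illustrative stand-ins and may not match Lhotse’s exact schema\; the point is that manifests are plain JSON and round-trip losslessly through ordinary tools:

```python
import json

# A rough sketch of a per-recording manifest entry plus an aligned supervision
# (transcript segment). Field names are illustrative, not Lhotse's exact schema.
recording = {
    "id": "rec-0001",
    "sources": [{"type": "file", "channels": [0], "source": "audio/rec-0001.wav"}],
    "sampling_rate": 16000,
    "num_samples": 160000,
    "duration": 10.0,  # seconds
}

supervision = {
    "id": "sup-0001",
    "recording_id": "rec-0001",  # links the transcript segment to its audio
    "start": 0.5,
    "duration": 4.2,
    "text": "hello world",
}

# Serialize and restore: plain JSON means manifests can be inspected,
# diffed, and edited without any special tooling.
serialized = json.dumps({"recordings": [recording], "supervisions": [supervision]})
restored = json.loads(serialized)
```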
\nBiography
\nPiotr Zelasko is an assistant research scientist in the Center for Language and Speech Processing (CLSP) who specializes in automatic speech recognition (ASR) and spoken language understanding (SLU). His current research focuses on applying multilingual and crosslingual speech recognition systems to categorize the phonetic inventory of a previously unknown language and on improving defenses against adversarial attacks on both speaker identification and automatic speech recognition systems. He is also addressing the question of how to structure a spontaneous conversation into high-level semantic units such as dialog acts or topics. Finally\, he is working on Lhotse + K2\, the next-generation speech processing research software ecosystem. Before joining Johns Hopkins\, Zelasko worked as a machine learning consultant for Avaya (2017-2019)\, and as a machine learning engineer for Techmo (2015-2017). Zelasko received his PhD (2019) in electronics engineering\, as well as his master’s (2014) and undergraduate degrees (2013) in acoustic engineering\, from AGH University of Science and Technology in Kraków\, Poland.
DTSTART;TZID=America/New_York:20211029T120000 DTEND;TZID=America/New_York:20211029T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Piotr Zelasko (CLSP at JHU) “Lhotse: a speech data representation library for the modern deep learning ecosystem” URL:https://www.clsp.jhu.edu/events/piotr-zelasko-clsp-at-jhu-lhotse-a-speech-data-representation-library-for-the-modern-deep-learning-ecosystem/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,October\,Zelasko END:VEVENT BEGIN:VEVENT UID:ai1ec-22423@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20221007T120000 DTEND;TZID=America/New_York:20221007T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Ariya Rastrow (Amazon) URL:https://www.clsp.jhu.edu/events/ariya-rastrow-amazon-2/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,October\,Rastrow END:VEVENT BEGIN:VEVENT UID:ai1ec-22394@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nModel robustness and spurious correlations have received increasing attention in the NLP community\, both in methods and evaluation. The term “spurious correlation” is overloaded\, though\, and can refer to any undesirable shortcuts learned by the model\, as judged by domain experts.
\nWhen designing mitigation algorithms\, we often (implicitly) assume that a spurious feature is irrelevant for prediction. However\, many features in NLP (e.g. word overlap and negation) are not spurious in the same sense that the background is spurious for classifying objects in an image. In contrast\, they carry important information that humans need to make predictions. In this talk\, we argue that it is more productive to characterize features in terms of their necessity and sufficiency for prediction. We then discuss the implications of this categorization in representation\, learning\, and evaluation.
\nBiogr aphy
\nHe He is an Assistant Professor in the Department of Computer Science and the Center for Data Science at New York University. She obtained her PhD in Computer Science at the University of Maryland\, College Park. Before joining NYU\, she spent a year at AWS AI and was a post-doc at Stanford University before that. She is interested in building robust and trustworthy NLP systems in human-centered settings. Her recent research focus includes robust language understanding\, collaborative text generation\, and understanding the capabilities and issues of large language models.
\nDTSTART;TZID=America/New_York:20221014T120000 DTEND;TZID=America/New_York:20221014T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:He He (New York University) “What We Talk about When We Talk about Spurious Correlations in NLP” URL:https://www.clsp.jhu.edu/events/he-he-new-york-university/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,He\,October END:VEVENT BEGIN:VEVENT UID:ai1ec-22395@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nModern learning architectures for natural language processing have been very successful in incorporating a huge amount of text into their parameters. However\, by and large\, such models store and use knowledge in distributed and decentralized ways. This proves unreliable and makes the models ill-suited for knowledge-intensive tasks that require reasoning over factual information in linguistic expressions. In this talk\, I will give a few examples of exploring alternative architectures to tackle those challenges. In particular\, we can improve the performance of such (language) models by representing\, storing\, and accessing knowledge in a dedicated memory component.
\nThis talk is based on several joint works with Yury Zemlyanskiy (Google Research)\, Michiel de Jong (USC and Google Research)\, William Cohen (Google Research and CMU)\, and our other collaborators at Google Research.
\nBiography
\nFei is a research scientist at Google Research. Before that\, he was a Professor of Computer Science at the University of Southern California. His primary research interests are machine learning and its application to various AI problems: speech and language processing\, computer vision\, robotics\, and recently weather forecasting and climate modeling. He holds a PhD (2007) in Computer and Information Science from the University of Pennsylvania\, and B.Sc. and M.Sc. degrees in Biomedical Engineering from Southeast University (Nanjing\, China).
DTSTART;TZID=America/New_York:20221024T120000 DTEND;TZID=America/New_York:20221024T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Fei Sha (University of Southern California) “Extracting Information from Text into Memory for Knowledge-Intensive Tasks” URL:https://www.clsp.jhu.edu/events/fei-sha-university-of-southern-california/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,October\,Sha END:VEVENT BEGIN:VEVENT UID:ai1ec-23900@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20231002T120000 DTEND;TZID=America/New_York:20231002T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:CLSP Student Seminar – Anna Favaro URL:https://www.clsp.jhu.edu/events/clsp-student-seminar-anna-favaro/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Favaro\,October END:VEVENT BEGIN:VEVENT UID:ai1ec-24115@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\nOur goal is to use AI to automatically find tax minimization strategies\, an approach which we call “Shelter Check.” It would come in two variants. Existing-Authority Shelter Check would aim to find whether existing tax law authorities can be combined to create tax minimization strategies\, so the IRS or Congress can shut them down. New-Authority Shelter Check would automate checking whether a new tax law authority – like proposed legislation or a draft court decision – would combine with existing authorities to create a new tax minimization strategy. We initially had high hopes for GPT-* large language models for implementing Shelter Check\, but our tests have shown that they do very poorly at basic legal reasoning and handling legal text. So we are now creating a benchmark and training data for LLMs’ handling of legal text\, hoping to spur improvements.
DTSTART;TZID=America/New_York:20231006T120000 DTEND;TZID=America/New_York:20231006T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:CLSP Student Seminar – Andrew Blair-Stanek “Shelter Check and GPT-4’s Bad Legal Performance” URL:https://www.clsp.jhu.edu/events/clsp-student-seminar-andrew-blair-stanek-shelter-check-and-gpt-4s-bad-legal-performance/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Blair-Stanek\,October END:VEVENT BEGIN:VEVENT UID:ai1ec-24005@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nLarge-scale generative models such as GPT and DALL-E have revolutionized natural language processing and computer vision research. These models not only generate high-fidelity text or image outputs\, but also demonstrate impressive domain and task generalization capabilities. In contrast\, audio generative models are relatively primitive in scale and generalization.
\nIn this talk\, I will start with a brief introduction to conventional neural speech generative models and discuss why they are unfit for scaling to Internet-scale data. Next\, by reviewing the latest large-scale generative models for text and images\, I will outline a few lines of promising approaches to building scalable speech models. Last\, I will present Voicebox\, our latest work to advance this area. Voicebox is the most versatile generative model for speech. It is trained with a simple task — text-conditioned speech infilling — on over 50K hours of multilingual speech with a powerful flow-matching objective. Through in-context learning\, Voicebox can perform monolingual/cross-lingual zero-shot TTS\, holistic style conversion\, transient noise removal\, content editing\, and diverse sample generation. Moreover\, Voicebox achieves state-of-the-art performance and excellent run-time efficiency.
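The flow-matching objective mentioned above can be sketched in a few lines: with linear probability paths\, the model regresses the constant velocity x1 - x0 along the interpolant x_t = (1 - t) x0 + t x1. A toy\, library-free illustration of this objective (not Voicebox’s actual implementation\, which conditions on text and trains a neural network):

```python
import random

def interpolant(x0, x1, t):
    """Linear probability path between a noise sample x0 and a data sample x1."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Flow-matching regression target for linear paths: d x_t / dt = x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

random.seed(0)
x1 = [1.0, -2.0, 0.5]                  # a "data" sample (stand-in for speech features)
x0 = [random.gauss(0, 1) for _ in x1]  # Gaussian noise sample
t = 0.3
xt = interpolant(x0, x1, t)
v = target_velocity(x0, x1)

# Sanity check: following the velocity field from x_t for the remaining
# time (1 - t) lands exactly on the data point x1.
recovered = [a + (1 - t) * b for a, b in zip(xt, v)]
```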
\nBiography
\nWei-Ning Hsu is a research scientist at Meta Foundational AI Research (FAIR) and currently the lead of the audio generation team. His research focuses on self-supervised learning and generative models for speech and audio. His pioneering work includes HuBERT\, AV-HuBERT\, TextlessNLP\, data2vec\, wav2vec-U\, textless speech translation\, and Voicebox.
\nPrior to joining Meta\, Wei-Ning worked at MERL and Google Brain as a research intern. He received his Ph.D. and S.M. degrees in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology in 2020 and 2018\, under the supervision of Dr. James Glass. He received his B.S. degree in Electrical Engineering from National Taiwan University in 2014\, under the supervision of Prof. Lin-shan Lee and Prof. Hsuan-Tien Lin.
DTSTART;TZID=America/New_York:20231009T120000 DTEND;TZID=America/New_York:20231009T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Wei-Ning Hsu (Meta Foundational AI Research) “Large Scale Universal Speech Generative Models” URL:https://www.clsp.jhu.edu/events/wei-ning-hsu-meta-foundational-ai-research-large-scale-universal-speech-generative-models/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Hsu\,October END:VEVENT BEGIN:VEVENT UID:ai1ec-23902@www.clsp.jhu.edu DTSTAMP:20240328T192103Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract
\nRecent advances in speech technology make heavy use of pre-trained models that learn from large quantities of raw (untranscribed) speech\, using “self-supervised” (i.e. unsupervised) learning. These models learn to transform the acoustic input into a different representational format that makes supervised learning (for tasks such as transcription or even translation) much easier. However\, *what* and *how* speech-relevant information is encoded in these representations is not well understood. I will talk about some work at various stages of completion in which my group is analyzing the structure of these representations\, to gain a more systematic understanding of how word-level\, phonetic\, and speaker information is encoded.
\nBiography
\nSharon Goldwater is a Professor in the Institute for Language\, Cognition and Computation at the University of Edinburgh’s School of Informatics. She received her PhD in 2007 from Brown University and spent two years as a postdoctoral researcher at Stanford University before moving to Edinburgh. Her research interests include unsupervised and minimally-supervised learning for speech and language processing\, computer modelling of language acquisition in children\, and computational studies of language use. Her main focus within linguistics has been on the lower levels of structure\, including phonetics\, phonology\, and morphology. Prof. Goldwater has received awards including the 2016 Roger Needham Award from the British Computer Society for “distinguished research contribution in computer science by a UK-based researcher who has completed up to 10 years of post-doctoral research.” She has served on the editorial boards of several journals\, including Computational Linguistics\, Transactions of the Association for Computational Linguistics\, and the inaugural board of OPEN MIND: Advances in Cognitive Science. She was a program chair for the EACL 2014 Conference and chaired the EACL governing board from 2019-2020.
DTSTART;TZID=America/New_York:20231027T120000 DTEND;TZID=America/New_York:20231027T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Sharon Goldwater (University of Edinburgh) “Analyzing Representations of Self-Supervised Speech Models” URL:https://www.clsp.jhu.edu/events/sharon-goldwater-university-of-edinburgh/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Goldwater\,October END:VEVENT END:VCALENDAR