Seminars

Feb
6
Mon
Sharon Levy (University of California, Santa Barbara) “Responsible AI via Responsible Large Language Models” @ Hackerman Hall B17
Feb 6 @ 12:00 pm – 1:15 pm

Abstract

While large language models have advanced the state of the art in natural language processing, these models are trained on large-scale datasets, which may include harmful information. Studies have shown that, as a result, the models exhibit social biases and generate misinformation after training. In this talk, I will discuss my work on analyzing and interpreting the risks of large language models across the areas of fairness, trustworthiness, and safety. I will first describe my research on detecting dialect bias between African American English (AAE) and Standard American English (SAE). The second part investigates the trustworthiness of models through the memorization and subsequent generation of conspiracy theories. I will end my talk with recent work in AI safety regarding text that may lead to physical harm.

Biography

Sharon is a 5th-year Ph.D. candidate at the University of California, Santa Barbara, where she is advised by Professor William Wang. Her research interests lie in natural language processing, with a focus on Responsible AI. Sharon’s research spans the subareas of fairness, trustworthiness, and safety, with publications in ACL, EMNLP, WWW, and LREC. She has spent summers interning at AWS, Meta, and Pinterest. Sharon is a 2022 EECS Rising Star and a current recipient of the Amazon Alexa AI Fellowship for Responsible AI.

Mar
3
Fri
John Hansen (University of Texas at Dallas) “Challenges and Advancements in Speaker Diarization & Recognition for Naturalistic Data Streams” @ Hackerman Hall B17
Mar 3 @ 12:00 pm – 1:15 pm

Abstract

Speech communication represents a core domain for education, team problem solving, social engagement, and business interactions. The ability of speech technology to extract layers of knowledge and assess engagement content represents the next generation of advanced speech solutions. Today, the emergence of BIG DATA, machine learning, and voice-enabled speech systems has created the need for effective voice capture and automatic speech/speaker recognition. The ability to employ speech and language technology to assess human-to-human interactions offers new research paradigms with a profound impact on assessing human interaction. In this talk, we will focus on big data naturalistic audio processing relating to (i) child learning spaces, and (ii) the NASA APOLLO lunar missions. ML-based technology advancements include automatic audio diarization, speech recognition, and speaker recognition. Child-teacher assessment of conversational interactions is explored, including keyword and “WH-word” (e.g., who, what, etc.). Diarization processing solutions are applied both to classroom/learning space child speech and to massive APOLLO data. CRSS-UTDallas is expanding our original Apollo-11 corpus, resulting in a massive multi-track audio processing challenge to make available 150,000 hours of Apollo mission data to be shared with science communities: (i) speech/language technology, (ii) STEM/science and team-based researchers, and (iii) education/historical/archiving specialists. Our goals here are to provide resources that allow us to better understand how people work and learn collaboratively and, for Apollo, how they accomplished one of mankind’s greatest scientific/technological challenges of the last century.

Biography

John H.L. Hansen received Ph.D. & M.S. degrees from Georgia Institute of Technology, and B.S.E.E. from Rutgers Univ. He joined Univ. of Texas at Dallas (UTDallas) in 2005, where he currently serves as Associate Dean for Research, Prof. of ECE, Distinguished Univ. Chair in Telecom. Engineering, and directs the Center for Robust Speech Systems (CRSS). He is an ISCA Fellow, IEEE Fellow, and has served as Member and TC-Chair of IEEE Signal Proc. Society, Speech & Language Proc. Tech. Comm. (SLTC), and Technical Advisor to U.S. Delegate for NATO (IST/TG-01). He served as ISCA President (2017-21), continues to serve on ISCA Board (2015-23) as Treasurer, has supervised 99 PhD/MS thesis candidates (EE, CE, BME, TE, CS, Ling., Cog.Sci., Spch.Sci., Hear.Sci.), and was recipient of the 2020 UT-Dallas Provost’s Award for Grad. PhD Research Mentoring. He is author/co-author of 865 journal/conference papers including 14 textbooks in the field of speech/language/hearing processing & technology, including coauthor of the textbook Discrete-Time Processing of Speech Signals (IEEE Press, 2000), and lead author of the report “The Impact of Speech Under ‘Stress’ on Military Speech Technology” (NATO RTO-TR-10, 2000). He served as Organizer, Chair/Co-Chair/Tech. Chair for ISCA INTERSPEECH-2022, IEEE ICASSP-2010, IEEE SLT-2014, ISCA INTERSPEECH-2002, and Tech. Chair for IEEE ICASSP-2024. He received the 2022 IEEE Signal Processing Society Leo Beranek Meritorious Service Award.

Mar
10
Fri
Denise DiPersio (Linguistic Data Consortium, University of Pennsylvania) “Data and Ethics: Where Does the Twain Meet?” @ Hackerman Hall B17
Mar 10 @ 12:00 pm – 1:15 pm

Abstract

As data-based technologies proliferate, it is increasingly important for researchers to be aware of their work’s wider impact. Concerns like navigating the IRB and figuring out copyright and licensing issues are still key, but the current shift in focus to matters like inclusivity, fairness, and transparency, and their impact on the research/development life cycle, has added complexity to the research task. In this talk, we will take a broad look at the various ways ethics intersects with natural language processing, machine learning, and artificial intelligence research and discuss strategies and resources for managing these concerns within the broader research framework.

Biography

Denise is responsible for the overall operation of LDC’s External Relations group, which includes intellectual property management, licensing, regulatory matters, publications, membership, and communications. Before joining LDC, she practiced law for over 20 years in the areas of international trade, intellectual property, and commercial litigation. She has an A.B. in Political Science from Bryn Mawr College and a Juris Doctor degree from the University of Miami School of Law.

Mar
17
Fri
Alessandra Cervone (Amazon) “Controllable Text Generation for Creative Applications” @ Hackerman Hall B17
Mar 17 @ 12:00 pm – 1:15 pm

Abstract

Recent advances in large pretrained language models have unlocked exciting new applications of Natural Language Generation for creative tasks, such as lyrics or humour generation. In this talk, we will discuss recent work by our team at Alexa AI and current challenges: (1) Pun understanding and generation: we release new datasets for pun understanding and the novel task of context-situated pun generation, and demonstrate the value of our annotations for pun classification and generation tasks. (2) Song lyric generation: we design a hierarchical lyric generation framework that enables us to generate pleasantly singable lyrics without training on melody-lyric aligned data, and show that our approach is competitive with strong baselines supervised on parallel data. (3) Create with Alexa: a multimodal story creation experience recently launched on Alexa devices, which leverages story text generation models in tandem with story visualization and background music generation models to produce multimodal stories for kids.

Biography

Alessandra Cervone is an Applied Scientist in the Natural Understanding team at Amazon Alexa AI. Alessandra holds an MSc in Speech and Language Processing from the University of Edinburgh and a PhD in CS from the University of Trento (Italy). During her PhD, Alessandra worked on computational models of coherence in open-domain dialogue, advised by Giuseppe Riccardi. In the first year of her PhD, she was the team leader of one of the teams selected to compete in the first edition of the Alexa Prize. More recently, her research interests have focused on natural language generation and its evaluation, in particular in the context of creative AI applications.

Mar
27
Mon
Student Seminar – Desh Raj @ Hackerman Hall B17
Mar 27 @ 12:00 pm – 1:15 pm

Mar
31
Fri
Emily Prud’hommeaux (Boston College) “Endangered or Just Under-Resourced? Evaluating ASR Quality and Utility When Data is Scarce” @ Hackerman Hall B17
Mar 31 @ 12:00 pm – 1:15 pm

Abstract

Despite many recent advances in automatic speech recognition (ASR), linguists and language communities engaged in language documentation projects continue to face the obstacle of the “transcription bottleneck”. Researchers in NLP typically do not distinguish between widely spoken languages that currently happen to have few training resources and endangered languages that will never have abundant data. As a result, we often fail to thoroughly explore when ASR is helpful for language documentation, what architectures work best for the sorts of languages that are in need of documentation, and how data can be collected and organized to produce optimal results. In this talk I describe several projects that attempt to bridge the gap between the promise of ASR for language documentation and the reality of using this technology in real-world settings.

Biography

Emily Prud’hommeaux is the Gianinno Family Sesquicentennial Assistant Professor in the Department of Computer Science at Boston College. She received her BA (Harvard) and MA (University of California, Los Angeles) in Linguistics, and her PhD in Computer Science and Engineering (OHSU/OGI). Her research area is natural language processing in low-resource settings, with a particular focus on endangered languages and the language of individuals with conditions impacting communication and cognition.

Sep
1
Fri
Lei Li (Carnegie Mellon University) “Empowering Responsible Use of Large Language Models” @ Hackerman Hall B17
Sep 1 @ 12:00 pm – 1:15 pm

Abstract

Large language models (LLMs) have demonstrated incredible power, but they also possess vulnerabilities that can lead to misuse and potential attacks. In this presentation, we will address two fundamental questions regarding the responsible utilization of LLMs: (1) How can we accurately identify AI-generated text? (2) What measures can safeguard the intellectual property of LLMs? We will introduce two recent watermarking techniques designed for text and models, respectively. Our discussion will encompass the theoretical underpinnings that ensure the correctness of watermark detection, along with robustness against evasion attacks. Furthermore, we will showcase empirical evidence validating their effectiveness. These findings establish a solid technical groundwork for policymakers, legal professionals, and generative AI practitioners alike.

Biography

Lei Li is an Assistant Professor in the Language Technologies Institute at Carnegie Mellon University. He received his Ph.D. from the Carnegie Mellon University School of Computer Science. He is a recipient of the ACL 2021 Best Paper Award, the CCF Young Elite Award in 2019, CCF distinguished speaker in 2017, the Wu Wen-tsün AI prize in 2017, and the 2012 ACM SIGKDD dissertation award (runner-up), and is recognized as a Notable Area Chair of ICLR 2023. Previously, he was a faculty member at UC Santa Barbara. Prior to that, he founded ByteDance AI Lab in 2016 and led its research in NLP, ML, Robotics, and Drug Discovery. He launched ByteDance’s machine translation system VolcTrans and AI writing system Xiaomingbot, serving one billion users.

Center for Language and Speech Processing