BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-20723@www.clsp.jhu.edu DTSTAMP:20240328T222620Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nText simplification aims to help audiences read and understand a piece of text through lexical\, syntactic\, and discourse modifications\, while remaining faithful to its central idea and meaning. Thanks to large-scale parallel corpora derived from Wikipedia and News\, much of modern-day text simplification research focuses on sentence simplification\, transforming original\, more complex sentences into simplified versions. In this talk\, I present new frontiers that focus on discourse operations. First\, we consider the challenging task of simplifying highly technical language\, in our case\, medical texts. We introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric to quantify stylistic differences between the two\, and models for paragraph-level simplification. Second\, we present the first data-driven study of inserting elaborations and explanations during simplification\, and illustrate the richness and complexities of this phenomenon.
\nBiography
\nAbstract
\nSpeech communication represents a core domain for education\, team problem solving\, social engagement\, and business interactions. The ability of speech technology to extract layers of knowledge and assess engagement content represents the next generation of advanced speech solutions. Today\, the emergence of BIG DATA\, Machine Learning\, and voice-enabled speech systems has created the need for effective voice capture and automatic speech/speaker recognition. The ability to employ speech and language technology to assess human-to-human interactions offers new research paradigms with profound impact on assessing human interaction. In this talk\, we will focus on big data naturalistic audio processing relating to (i) child learning spaces\, and (ii) the NASA APOLLO lunar missions. ML-based technology advancements include automatic audio diarization\, speech recognition\, and speaker recognition. Child-teacher assessment of conversational interactions is explored\, including keywords and “WH-words” (e.g.\, who\, what). Diarization processing solutions are applied to both classroom/learning space child speech and massive APOLLO data. CRSS-UTDallas is expanding our original Apollo-11 corpus\, resulting in a massive multi-track audio processing challenge to make available 150\,000 hours of Apollo mission data to be shared with science communities: (i) speech/language technology\, (ii) STEM/science and team-based researchers\, and (iii) education/historical/archiving specialists. Our goal here is to provide resources that allow us to better understand how people work and learn collaboratively. For Apollo\, this means understanding how teams worked together to accomplish one of mankind’s greatest scientific/technological challenges of the last century.
\nBiography
\nJohn H.L. Hansen received Ph.D. & M.S. degrees from Georgia Institute of Technology\, and B.S.E.E. from Rutgers Univ. He joined Univ. of Texas at Dallas (UTDallas) in 2005\, where he currently serves as Associate Dean for Research\, Prof. of ECE\, Distinguished Univ. Chair in Telecom. Engineering\, and directs Center for Robust Speech Systems (CRSS). He is an ISCA Fellow\, IEEE Fellow\, and has served as Member and TC-Chair of IEEE Signal Proc. Society\, Speech & Language Proc. Tech. Comm. (SLTC)\, and Technical Advisor to U.S. Delegate for NATO (IST/TG-01). He served as ISCA President (2017-21)\, continues to serve on ISCA Board (2015-23) as Treasurer\, has supervised 99 PhD/MS thesis candidates (EE\,CE\,BME\,TE\,CS\,Ling.\,Cog.Sci.\,Spch.Sci.\,Hear.Sci)\, and was recipient of 2020 UT-Dallas Provost’s Award for Grad. PhD Research Mentoring. He is author/co-author of 865 journal/conference papers including 14 textbooks in the field of speech/language/hearing processing & technology\, including coauthor of textbook Discrete-Time Processing of Speech Signals (IEEE Press\, 2000)\, and lead author of the report “The Impact of Speech Under ‘Stress’ on Military Speech Technology\,” (NATO RTO-TR-10\, 2000). He served as Organizer\, Chair/Co-Chair/Tech.Chair for ISCA INTERSPEECH-2022\, IEEE ICASSP-2010\, IEEE SLT-2014\, ISCA INTERSPEECH-2002\, and Tech. Chair for IEEE ICASSP-2024. He received the 2022 IEEE Signal Processing Society Leo Beranek MERITORIOUS SERVICE Award.
\nDTSTART;TZID=America/New_York:20230303T120000 DTEND;TZID=America/New_York:20230303T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:John Hansen (University of Texas at Dallas) “Challenges and Advancements in Speaker Diarization & Recognition for Naturalistic Data Streams” URL:https://www.clsp.jhu.edu/events/john-hansen-university-of-texas-at-dallas/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Hansen\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-23882@www.clsp.jhu.edu DTSTAMP:20240328T222620Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nLarge language models (LLMs) have demonstrated incredible power\, but they also possess vulnerabilities that can lead to misuse and potential attacks. In this presentation\, we will address two fundamental questions regarding the responsible utilization of LLMs: (1) How can we accurately identify AI-generated text? (2) What measures can safeguard the intellectual property of LLMs? We will introduce two recent watermarking techniques designed for text and models\, respectively. Our discussion will encompass the theoretical underpinnings that ensure the correctness of watermark detection\, along with robustness against evasion attacks. Furthermore\, we will showcase empirical evidence validating their effectiveness. These findings establish a solid technical groundwork for policymakers\, legal professionals\, and generative AI practitioners alike.
\nBiography
\nLei Li is an Assistant Professor in the Language Technologies Institute at Carnegie Mellon University. He received his Ph.D. from Carnegie Mellon University School of Computer Science. He is a recipient of ACL 2021 Best Paper Award\, CCF Young Elite Award in 2019\, CCF distinguished speaker in 2017\, Wu Wen-tsün AI prize in 2017\, and 2012 ACM SIGKDD dissertation award (runner-up)\, and is recognized as Notable Area Chair of ICLR 2023. Previously\, he was a faculty member at UC Santa Barbara. Prior to that\, he founded ByteDance AI Lab in 2016 and led its research in NLP\, ML\, Robotics\, and Drug Discovery. He launched ByteDance’s machine translation system VolcTrans and AI writing system Xiaomingbot\, serving one billion users.
DTSTART;TZID=America/New_York:20230901T120000 DTEND;TZID=America/New_York:20230901T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Lei Li (Carnegie Mellon University) “Empowering Responsible Use of Large Language Models” URL:https://www.clsp.jhu.edu/events/lei-li-carnegie-mellon-university-empowering-responsible-use-of-large-language-models/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2023\,Li\,September END:VEVENT END:VCALENDAR