BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-21023@www.clsp.jhu.edu
DTSTAMP:20240328T101244Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nSpeech data is notoriously difficult to work with due to a variety of codecs\, lengths of recordings\, and metadata formats. We present Lhotse\, a speech data representation library that draws upon lessons learned from the Kaldi speech recognition toolkit and brings its concepts into the modern deep learning ecosystem. Lhotse provides a common JSON description format with corresponding Python classes and data preparation recipes for over 30 popular speech corpora. Various datasets can be easily combined and re-purposed for different tasks. The library handles multi-channel recordings\, long recordings\, local and cloud storage\, and lazy and on-the-fly operations\, among other features. We introduce the Cut and CutSet concepts\, which simplify common data wrangling tasks for audio and help incorporate the acoustic context of speech utterances. Finally\, we show how Lhotse leverages PyTorch data API abstractions and adapts them to handle speech data for deep learning.\nBiography\nPiotr Zelasko is an assistant research scientist in the Center for Language and Speech Processing (CLSP) who specializes in automatic speech recognition (ASR) and spoken language understanding (SLU).
 His current research focuses on applying multilingual and crosslingual speech recognition systems to categorize the phonetic inventory of a previously unknown language and on improving defenses against adversarial attacks on both speaker identification and automatic speech recognition systems. He is also addressing the question of how to structure a spontaneous conversation into high-level semantic units such as dialog acts or topics. Finally\, he is working on Lhotse + K2\, the next-generation speech processing research software ecosystem. Before joining Johns Hopkins\, Zelasko worked as a machine learning consultant for Avaya (2017-2019)\, and as a machine learning engineer for Techmo (2015-2017). Zelasko received his PhD (2019) in electronics engineering\, as well as his master’s (2014) and undergraduate (2013) degrees in acoustic engineering from AGH University of Science and Technology in Kraków\, Poland.
DTSTART;TZID=America/New_York:20211029T120000
DTEND;TZID=America/New_York:20211029T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Piotr Zelasko (CLSP at JHU) “Lhotse: a speech data representation library for the modern deep learning ecosystem”
URL:https://www.clsp.jhu.edu/events/piotr-zelasko-clsp-at-jhu-lhotse-a-speech-data-representation-library-for-the-modern-deep-learning-ecosystem/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2021\,October\,Zelasko
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-22412@www.clsp.jhu.edu
DTSTAMP:20240328T101244Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nDriven by the goal of eradicating language barriers on a global scale\, machine translation has solidified itself as a key focus of artificial intelligence research today. However\, such efforts have coalesced around a small subset of languages\, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200-language barrier while ensuring safe\, high-quality results\, all while keeping ethical considerations in mind? In this talk\, I introduce No Language Left Behind\, an initiative to break language barriers for low-resource languages. In No Language Left Behind\, we took on the low-resource language translation challenge by first contextualizing the need for translation support through exploratory interviews with native speakers. Then\, we created datasets and models aimed at narrowing the performance gap between low- and high-resource languages. We proposed multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically\, we evaluated the performance of over 40\,000 different translation directions using a human-translated benchmark\, Flores-200\, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state of the art\, laying important groundwork towards realizing a universal translation system in an open-source manner.\nBiography\nAngela is a research scientist at Meta AI Research in New York\, focusing on supporting efforts in speech and language research.
 Recent projects include No Language Left Behind (https://ai.facebook.com/research/no-language-left-behind/) and Universal Speech Translation for Unwritten Languages (https://ai.facebook.com/blog/ai-translation-hokkien/). Before working on translation\, Angela focused on research in on-device models for NLP and computer vision\, and on text generation.
DTSTART;TZID=America/New_York:20221118T120000
DTEND;TZID=America/New_York:20221118T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Angela Fan (Meta AI Research) “No Language Left Behind: Scaling Human-Centered Machine Translation”
URL:https://www.clsp.jhu.edu/events/angela-fan-facebook/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Fan\,November
END:VEVENT
END:VCALENDAR