BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-22412@www.clsp.jhu.edu
DTSTAMP:20240329T045330Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nDriven by the goal of eradicating language barriers on a global scale\, machine translation has solidified itself as a key focus of artificial intelligence research today. However\, such efforts have coalesced around a small subset of languages\, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200 language barrier while ensuring safe\, high-quality results\, all while keeping ethical considerations in mind? In this talk\, I introduce No Language Left Behind\, an initiative to break language barriers for low-resource languages. In No Language Left Behind\, we took on the low-resource language translation challenge by first contextualizing the need for translation support through exploratory interviews with native speakers. Then\, we created datasets and models aimed at narrowing the performance gap between low and high-resource languages. We proposed multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically\, we evaluated the performance of over 40\,000 different translation directions using a human-translated benchmark\, Flores-200\, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art\, laying important groundwork towards realizing a universal translation system in an open-source manner.\nBiography\nAngela is a research scientist at Meta AI Research in New York\, focusing on supporting efforts in speech and language research. Recent projects include No Language Left Behind (https://ai.facebook.com/research/no-language-left-behind/) and Universal Speech Translation for Unwritten Languages (https://ai.facebook.com/blog/ai-translation-hokkien/). Before translation\, Angela previously focused on research in on-device models for NLP and computer vision and text generation.
DTSTART;TZID=America/New_York:20221118T120000
DTEND;TZID=America/New_York:20221118T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Angela Fan (Meta AI Research) “No Language Left Behind: Scaling Human-Centered Machine Translation”
URL:https://www.clsp.jhu.edu/events/angela-fan-facebook/
X-COST-TYPE:free
X-ALT-DESC;FMTTYPE=text/html:\\n\\n
\\nAbstract
\nDriven by the goal of eradicating language barriers on a global scale\, machine translation has solidified itself as a key focus of artificial intelligence research today. However\, such efforts have coalesced around a small subset of languages\, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200 language barrier while ensuring safe\, high-quality results\, all while keeping ethical considerations in mind? In this talk\, I introduce No Language Left Behind\, an initiative to break language barriers for low-resource languages. In No Language Left Behind\, we took on the low-resource language translation challenge by first contextualizing the need for translation support through exploratory interviews with native speakers. Then\, we created datasets and models aimed at narrowing the performance gap between low and high-resource languages. We proposed multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically\, we evaluated the performance of over 40\,000 different translation directions using a human-translated benchmark\, Flores-200\, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art\, laying important groundwork towards realizing a universal translation system in an open-source manner.
\nBiography
\nAngela is a research scientist at Meta AI Research in New York\, focusing on supporting efforts in speech and language research. Recent projects include No Language Left Behind (https://ai.facebook.com/research/no-language-left-behind/) and Universal Speech Translation for Unwritten Languages (https://ai.facebook.com/blog/ai-translation-hokkien/). Before translation\, Angela previously focused on research in on-device models for NLP and computer vision and text generation.
\n\n
X-TAGS;LANGUAGE=en-US:2022\,Fan\,November
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-23894@www.clsp.jhu.edu
DTSTAMP:20240329T045330Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nThe use of NLP in the realm of financial technology is broad and complex\, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks\; however\, no LLM specialized for the financial domain has been reported in the literature. In this work\, we present BloombergGPT\, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg’s extensive data sources\, perhaps the largest domain-specific dataset yet\, augmented with 345 billion tokens from general-purpose datasets. We validate BloombergGPT on standard LLM benchmarks\, open financial benchmarks\, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally\, we explain our modeling choices\, training process\, and evaluation methodology.\nBiography\nMark Dredze is the John C Malone Professor of Computer Science at Johns Hopkins University and the Director of Research (Foundations of AI) for the JHU AI-X Foundry. He develops Artificial Intelligence Systems based on natural language processing and explores applications to public health and medicine.\nProf. Dredze is affiliated with the Malone Center for Engineering in Healthcare\, the Center for Language and Speech Processing\, among others. He holds a joint appointment in the Biomedical Informatics & Data Science Section (BIDS)\, under the Department of Medicine (DOM)\, Division of General Internal Medicine (GIM) in the School of Medicine. He obtained his PhD from the University of Pennsylvania in 2009.
DTSTART;TZID=America/New_York:20230918T120000
DTEND;TZID=America/New_York:20230918T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Mark Dredze (Johns Hopkins University) “BloombergGPT: A Large Language Model for Finance”
URL:https://www.clsp.jhu.edu/events/mark-dredze-johns-hopkins-university/
X-COST-TYPE:free
X-ALT-DESC;FMTTYPE=text/html:\\n\\n
\\nAbstract
\nThe use of NLP in the realm of financial technology is broad and complex\, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks\; however\, no LLM specialized for the financial domain has been reported in the literature. In this work\, we present BloombergGPT\, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg’s extensive data sources\, perhaps the largest domain-specific dataset yet\, augmented with 345 billion tokens from general-purpose datasets. We validate BloombergGPT on standard LLM benchmarks\, open financial benchmarks\, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally\, we explain our modeling choices\, training process\, and evaluation methodology.
\nBiography
\nMark Dredze is the John C Malone Professor of Computer Science at Johns Hopkins University and the Director of Research (Foundations of AI) for the JHU AI-X Foundry. He develops Artificial Intelligence Systems based on natural language processing and explores applications to public health and medicine.
\nProf. Dredze is affiliated with the Malone Center for Engineering in Healthcare\, the Center for Language and Speech Processing\, among others. He holds a joint appointment in the Biomedical Informatics & Data Science Section (BIDS)\, under the Department of Medicine (DOM)\, Division of General Internal Medicine (GIM) in the School of Medicine. He obtained his PhD from the University of Pennsylvania in 2009.
\n
X-TAGS;LANGUAGE=en-US:2023\,Dredze\,September
END:VEVENT
END:VCALENDAR