BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-22375@www.clsp.jhu.edu
DTSTAMP:20240328T191529Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nI will present our work on data augmentation using style transfer as a way to improve domain adaptation in sequence labeling tasks. The target domain is social media data\, and the task is named entity recognition (NER). The premise is that we can transform the labeled out-of-domain data into something that is stylistically closer to the target data. Then we can train a model on a combination of the generated data and the smaller amount of in-domain data to improve NER prediction performance. I will show recent empirical results on these efforts.\nIf time allows\, I will also give an overview of other research projects I’m currently leading at the RiTUAL (Research in Text Understanding and Analysis of Language) lab. The common thread among all these research problems is the scarcity of labeled data.\nBiography\nThamar Solorio is a Professor of Computer Science at the University of Houston (UH). She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica\, Óptica y Electrónica\, in Puebla\, Mexico. Her research interests include information extraction from social media data\, enabling technology for code-switched data\, stylistic modeling of text\, and more recently multimodal approaches for online content understanding. She is the director and founder of the RiTUAL Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution\, and recipient of the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is currently serving a second term as an elected board member of the North American Chapter of the Association for Computational Linguistics and was PC co-chair for NAACL 2019. She recently joined the team of Editors in Chief for the ACL Rolling Review (ARR) system. Her research is currently funded by the NSF and by Adobe.
DTSTART;TZID=America/New_York:20220923T120000
DTEND;TZID=America/New_York:20220923T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Thamar Solorio (University of Houston) “Style Transfer for Data Augmentation in Sequence Labeling Tasks”
URL:https://www.clsp.jhu.edu/events/thamar-solorio-university-of-houston-style-transfer-for-data-augmentation-in-sequence-labeling-tasks/
X-COST-TYPE:free
X-ALT-DESC;FMTTYPE=text/html:\\n\\n
 \\nAbstract\nI will present our work on data augmentation using style transfer as a way to improve domain adaptation in sequence labeling tasks. The target domain is social media data\, and the task is named entity recognition (NER). The premise is that we can transform the labeled out-of-domain data into something that is stylistically closer to the target data. Then we can train a model on a combination of the generated data and the smaller amount of in-domain data to improve NER prediction performance. I will show recent empirical results on these efforts.\nIf time allows\, I will also give an overview of other research projects I’m currently leading at the RiTUAL (Research in Text Understanding and Analysis of Language) lab. The common thread among all these research problems is the scarcity of labeled data.\nBiography\nThamar Solorio is a Professor of Computer Science at the University of Houston (UH). She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica\, Óptica y Electrónica\, in Puebla\, Mexico. Her research interests include information extraction from social media data\, enabling technology for code-switched data\, stylistic modeling of text\, and more recently multimodal approaches for online content understanding. She is the director and founder of the RiTUAL Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution\, and recipient of the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is currently serving a second term as an elected board member of the North American Chapter of the Association for Computational Linguistics and was PC co-chair for NAACL 2019. She recently joined the team of Editors in Chief for the ACL Rolling Review (ARR) system. Her research is currently funded by the NSF and by Adobe.
 \n
X-TAGS;LANGUAGE=en-US:2022\,September\,Solorio
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-24465@www.clsp.jhu.edu
DTSTAMP:20240328T191529Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nLarge Language Models (LLMs) have demonstrated remarkable capabilities across various domains. However\, it is still very challenging to build highly reliable applications with LLMs that support specialized use cases. LLMs trained on web data often excel at capturing general language patterns\, but they can struggle to support specialized domains and personalized user needs. Moreover\, LLMs can produce errors that are deceptively plausible\, making them potentially dangerous for high-trust scenarios. In this talk\, I will discuss some of our recent efforts in addressing these challenges with data-efficient tuning methods and a novel factuality evaluation framework. Specifically\, my talk will focus on building multilingual applications\, one crucial use case often characterized by limited tuning and evaluation data.\nBio\nXinyi (Cindy) Wang is a research scientist at Google DeepMind working on Large Language Models (LLMs) and their application to generative question-answering. She has worked on multilingual instruction-tuning for Gemini and multilingual generative models used in Google Search. Before Google DeepMind\, Cindy Wang obtained her PhD degree in Language Technologies at Carnegie Mellon University. During her PhD\, she mainly worked on developing data-efficient natural language processing (NLP) systems. She has made several contributions in data selection\, data representation\, and model adaptation for multilingual NLP.
DTSTART;TZID=America/New_York:20240308T120000
DTEND;TZID=America/New_York:20240308T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Cindy Wang (Google DeepMind) “Building Data-Efficient and Reliable Applications with Large Language Models”
URL:https://www.clsp.jhu.edu/events/cindy-wang-google-deepmind-building-data-efficient-and-reliable-applications-with-large-language-models/
X-COST-TYPE:free
X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstract
 \nLarge Language Models (LLMs) have demonstrated remarkable capabilities across various domains. However\, it is still very challenging to build highly reliable applications with LLMs that support specialized use cases. LLMs trained on web data often excel at capturing general language patterns\, but they can struggle to support specialized domains and personalized user needs. Moreover\, LLMs can produce errors that are deceptively plausible\, making them potentially dangerous for high-trust scenarios. In this talk\, I will discuss some of our recent efforts in addressing these challenges with data-efficient tuning methods and a novel factuality evaluation framework. Specifically\, my talk will focus on building multilingual applications\, one crucial use case often characterized by limited tuning and evaluation data.\nBio\nXinyi (Cindy) Wang is a research scientist at Google DeepMind working on Large Language Models (LLMs) and their application to generative question-answering. She has worked on multilingual instruction-tuning for Gemini and multilingual generative models used in Google Search. Before Google DeepMind\, Cindy Wang obtained her PhD degree in Language Technologies at Carnegie Mellon University. During her PhD\, she mainly worked on developing data-efficient natural language processing (NLP) systems. She has made several contributions in data selection\, data representation\, and model adaptation for multilingual NLP.\n
X-TAGS;LANGUAGE=en-US:2024\,March\,Wang
END:VEVENT
END:VCALENDAR