BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-20987@www.clsp.jhu.edu
DTSTAMP:20240329T135649Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nWhile there is a vast amount of text written about nearly any topic\, this is often difficult for someone unfamiliar with a specific field to understand. Automated text simplification aims to reduce the complexity of a document\, making it more comprehensible to a broader audience. Much of the research in this field has traditionally focused on simplification sub-tasks\, such as lexical\, syntactic\, or sentence-level simplification. However\, current systems struggle to consistently produce high-quality simplifications. Phrase-based models tend to make too many poor transformations\; on the other hand\, recent neural models\, while producing grammatical output\, often do not make all needed changes to the original text. In this thesis\, I discuss novel approaches for improving lexical and sentence-level simplification systems. Regarding sentence simplification models\, after noting that encouraging diversity at inference time leads to significant improvements\, I take a closer look at the idea of diversity and perform an exhaustive comparison of diverse decoding techniques on other generation tasks. I also discuss the limitations in the framing of current simplification tasks\, which prevent these models from yet being practically useful. Thus\, I also propose a retrieval-based reformulation of the problem. Specifically\, starting with a document\, I identify concepts critical to understanding its content\, and then retrieve documents relevant for each concept\, re-ranking them based on the desired complexity level.\nBiography\nI’m a research scientist at the HLTCOE at Johns Hopkins University. My primary research interests are in language generation\, diverse and constrained decoding\, and information retrieval. During my PhD I focused mainly on the task of text simplification\, and now am working on formulating structured prediction problems as end-to-end generation tasks. I received my PhD in July 2021 from the University of Pennsylvania with Chris Callison-Burch and Marianna Apidianaki.
DTSTART;TZID=America/New_York:20211022T120000
DTEND;TZID=America/New_York:20211022T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Reno Kriz (HLTCOE – JHU) “Towards a Practically Useful Text Simplification System”
URL:https://www.clsp.jhu.edu/events/reno-kriz-hltcoe-jhu-towards-a-practically-useful-text-simplification-system/
X-COST-TYPE:free
X-ALT-DESC;FMTTYPE=text/html:\\n\\n
\\nAbstract
\nWhile there is a vast amount of text written about nearly any topic\, this is often difficult for someone unfamiliar with a specific field to understand. Automated text simplification aims to reduce the complexity of a document\, making it more comprehensible to a broader audience. Much of the research in this field has traditionally focused on simplification sub-tasks\, such as lexical\, syntactic\, or sentence-level simplification. However\, current systems struggle to consistently produce high-quality simplifications. Phrase-based models tend to make too many poor transformations\; on the other hand\, recent neural models\, while producing grammatical output\, often do not make all needed changes to the original text. In this thesis\, I discuss novel approaches for improving lexical and sentence-level simplification systems. Regarding sentence simplification models\, after noting that encouraging diversity at inference time leads to significant improvements\, I take a closer look at the idea of diversity and perform an exhaustive comparison of diverse decoding techniques on other generation tasks. I also discuss the limitations in the framing of current simplification tasks\, which prevent these models from yet being practically useful. Thus\, I also propose a retrieval-based reformulation of the problem. Specifically\, starting with a document\, I identify concepts critical to understanding its content\, and then retrieve documents relevant for each concept\, re-ranking them based on the desired complexity level.
\nBiography
\nI’m a research scientist at the HLTCOE at Johns Hopkins University. My primary research interests are in language generation\, diverse and constrained decoding\, and information retrieval. During my PhD I focused mainly on the task of text simplification\, and now am working on formulating structured prediction problems as end-to-end generation tasks. I received my PhD in July 2021 from the University of Pennsylvania with Chris Callison-Burch and Marianna Apidianaki.
\n\n
X-TAGS;LANGUAGE=en-US:2021\,Kriz\,October
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-21615@www.clsp.jhu.edu
DTSTAMP:20240329T135649Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:Abstract\n\n\nWe consider a problem of data collection for semantically rich NLU tasks\, where detailed semantics of documents (or utterances) are captured using a complex meaning representation. Previously\, data collection for such tasks came at the cost of either extensive annotator training (e.g. in FrameNet or PropBank) or a simplified meaning representation (e.g. in QA-SRL or Overnight). In this talk\, we present two systems [1\, 2] that aim to support fast\, accurate\, and expressive semantic annotations by pairing human workers with a trained model in the loop.\n\nThe first system\, called Guided K-best [1]\, is an annotation toolkit for conversational semantic parsing. Instead of typing annotations from scratch\, data specialists choose a correct parse from the K-best output of a few-shot prototyped model. As the K-best list can be large (e.g. K=100)\, we guide the annotators’ exploration of the K-best list via explainable hierarchical clustering. In addition\, we experiment with RoBERTa-based reranking of the K-best list to recalibrate the few-shot model towards Accuracy@K. The final system allows annotating data up to 35% faster than the standard\, non-guided K-best and improves the few-shot model’s top-1 accuracy by up to 18%. The second system\, called SchemaBlocks [2]\, is an annotation toolkit for schemas\, or structured descriptions of frequent real-world scenarios (e.g.\, cooking a meal). It represents schemas in the annotation UI as nested blocks. Using a novel Causal ARM model\, we further speed up the annotation process and guide data specialists towards expressive and diverse schemas. As part of this work\, we collect 232 schemas\, evaluating their internal coherence and their coverage on large-scale newswire corpora.\n\n\n
DTSTART;TZID=America/New_York:20220311T120000
DTEND;TZID=America/New_York:20220311T131500
LOCATION:Virtual Seminar
SEQUENCE:0
SUMMARY:Student Seminar – Anton Belyy “Systems for Human-AI Cooperation on Collecting Semantic Annotations”
URL:https://www.clsp.jhu.edu/events/student-seminar-anton-belyy-systems-for-human-ai-cooperation-on-collecting-semantic-annotations/
X-COST-TYPE:free
X-ALT-DESC;FMTTYPE=text/html:\\n\\n
\\nAbstract
\n\n
X-TAGS;LANGUAGE=en-US:2022\,Belyy\,March
END:VEVENT
END:VCALENDAR