BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-20987@www.clsp.jhu.edu DTSTAMP:20240329T141031Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:
Abstract
\nWhile there is a vast amount of text written about nearly any topic\, this is often difficult for someone unfamiliar with a specific field to understand. Automated text simplification aims to reduce the complexity of a document\, making it more comprehensible to a broader audience. Much of the research in this field has traditionally focused on simplification sub-tasks\, such as lexical\, syntactic\, or sentence-level simplification. However\, current systems struggle to consistently produce high-quality simplifications. Phrase-based models tend to make too many poor transformations\; on the other hand\, recent neural models\, while producing grammatical output\, often do not make all needed changes to the original text. In this thesis\, I discuss novel approaches for improving lexical and sentence-level simplification systems. Regarding sentence simplification models\, after noting that encouraging diversity at inference time leads to significant improvements\, I take a closer look at the idea of diversity and perform an exhaustive comparison of diverse decoding techniques on other generation tasks. I also discuss the limitations in the framing of current simplification tasks\, which prevent these models from yet being practically useful. Thus\, I also propose a retrieval-based reformulation of the problem. Specifically\, starting with a document\, I identify concepts critical to understanding its content\, and then retrieve documents relevant for each concept\, re-ranking them based on the desired complexity level.
\nBiography
\nI’m a research scientist at the HLTCOE at Johns Hopkins University. My primary research interests are in language generation\, diverse and constrained decoding\, and information retrieval. During my PhD I focused mainly on the task of text simplification\, and now am working on formulating structured prediction problems as end-to-end generation tasks. I received my PhD in July 2021 from the University of Pennsylvania with Chris Callison-Burch and Marianna Apidianaki.
\nDTSTART;TZID=America/New_York:20211022T120000 DTEND;TZID=America/New_York:20211022T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Reno Kriz (HLTCOE – JHU) “Towards a Practically Useful Text Simplification System” URL:https://www.clsp.jhu.edu/events/reno-kriz-hltcoe-jhu-towards-a-practically-useful-text-simplification-system/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,Kriz\,October END:VEVENT BEGIN:VEVENT UID:ai1ec-21068@www.clsp.jhu.edu DTSTAMP:20240329T141031Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION: DTSTART;TZID=America/New_York:20211203T120000 DTEND;TZID=America/New_York:20211203T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Eric Ringger (Zillow Group) URL:https://www.clsp.jhu.edu/events/eric-ringger-zillow-group/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2021\,December\,Ringger END:VEVENT BEGIN:VEVENT UID:ai1ec-21270@www.clsp.jhu.edu DTSTAMP:20240329T141031Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:
Abstract
\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have questioned the robustness of longitudinal analyses based on statistical methods due to issues of temporal bias and semantic shift. To what extent are changes in semantics over time affecting the reliability of longitudinal analyses? We examine this question through a case study: understanding shifts in mental health during the course of the COVID-19 pandemic. We demonstrate that a recently-introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and improve predictive generalization over time. Ultimately\, we find that these analyses are critical to producing accurate longitudinal studies of social media.
DTSTART;TZID=America/New_York:20220207T120000 DTEND;TZID=America/New_York:20220207T131500 LOCATION:In Person or Virtual Option @ https://wse.zoom.us/j/96735183473 @ 234 Ames Hall\, 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Keith Harrigian “The Problem of Semantic Shift in Longitudinal Monitoring of Social Media: A Case Study on Mental Health during the COVID-19 Pandemic” URL:https://www.clsp.jhu.edu/events/student-seminar-keith-harrigian-the-problem-of-semantic-shift-in-longitudinal-monitoring-of-social-media-a-case-study-on-mental-health-during-the-covid-19-pandemic/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,February\,Harrigian END:VEVENT BEGIN:VEVENT UID:ai1ec-21275@www.clsp.jhu.edu DTSTAMP:20240329T141031Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\n\n\n\n\nAutomatic discovery of phone or word-like units is one of the core objectives in zero-resource speech processing. Recent attempts employ contrastive predictive coding (CPC)\, where the model learns representations by predicting the next frame given past context. However\, CPC only looks at the audio signal’s structure at the frame level. The speech structure exists beyond frame-level\, i.e.\, at phone level or even higher. We propose a segmental contrastive predictive coding (SCPC) framework to learn from the signal structure at both the frame and phone levels.\n\n\nSCPC is a hierarchical model with three stages trained in an end-to-end manner. In the first stage\, the model predicts future feature frames and extracts frame-level representation from the raw waveform. In the second stage\, a differentiable boundary detector finds variable-length segments. In the last stage\, the model predicts future segments to learn segment representations. Experiments show that our model outperforms existing phone and word segmentation methods on TIMIT and Buckeye datasets.
Abstract
\nSocial media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have shown the absence of appropriate tuning\, specifically in the presence of semantic shift\, can hinder robustness of the underlying methods. However\, little is known about the practical effect this sensitivity may have on downstream longitudinal analyses. We explore this gap in the literature through a timely case study: understanding shifts in depression during the course of the COVID-19 pandemic. We find that inclusion of only a small number of semantically-unstable features can promote significant changes in longitudinal estimates of our target outcome. At the same time\, we demonstrate that a recently-introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and\, in turn\, improve predictive generalization.
DTSTART;TZID=America/New_York:20220318T120000 DTEND;TZID=America/New_York:20220318T131500 LOCATION:Ames Hall 234 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Student Seminar – Keith Harrigian “The Problem of Semantic Shift in Longitudinal Monitoring of Social Media” URL:https://www.clsp.jhu.edu/events/student-seminar-keith-harrigian-the-problem-of-semantic-shift-in-longitudinal-monitoring-of-social-media/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2022\,Harrigian\,March END:VEVENT BEGIN:VEVENT UID:ai1ec-24457@www.clsp.jhu.edu DTSTAMP:20240329T141031Z CATEGORIES;LANGUAGE=en-US:Student Seminars CONTACT: DESCRIPTION:Abstract
\nAs artificial intelligence (AI) continues to rapidly expand into existing healthcare infrastructure – e.g.\, clinical decision support\, administrative tasks\, and public health surveillance – it is perhaps more important than ever to reflect on the broader purpose of such systems. While much focus has been on the potential for this technology to improve general health outcomes\, there also exists a significant\, but understated\, opportunity to use this technology to address health-related disparities. Accomplishing the latter depends not only on our ability to effectively identify addressable areas of systemic inequality and translate them into tasks that are machine learnable\, but also our ability to measure\, interpret\, and counteract barriers in training data that may inhibit robustness to distribution shift upon deployment (i.e.\, new populations\, temporal dynamics). In this talk\, we will discuss progress made along both of these dimensions. We will begin by providing background on the state of AI for promoting health equity. Then\, we will present results from a recent clinical phenotyping project and discuss their implication on prevailing views regarding language model robustness in clinical applications. Finally\, we will showcase ongoing efforts to proactively address systemic inequality in healthcare by identifying and characterizing stigmatizing language in medical records.
DTSTART;TZID=America/New_York:20240226T120000 DTEND;TZID=America/New_York:20240226T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Keith Harrigian (JHU) “Fighting Bias From Bias: Robust Natural Language Processing Techniques to Promote Health Equity” URL:https://www.clsp.jhu.edu/events/keith-harrigian-jhu-fighting-bias-from-bias-robust-natural-language-processing-techniques-to-promote-health-equity/ X-COST-TYPE:free X-TAGS;LANGUAGE=en-US:2024\,February\,Harrigian END:VEVENT END:VCALENDAR