Jessy Li (University of Texas at Austin – Virtual Visit) “New Challenges in Text Simplification”
3400 N. Charles Street
Text simplification aims to help audiences read and understand a piece of text through lexical, syntactic, and discourse modifications, while remaining faithful to its central idea and meaning. Thanks to large-scale parallel corpora derived from Wikipedia and news, much of modern-day text simplification research focuses on sentence simplification, transforming original, more complex sentences into simplified versions. In this talk, I present new frontiers that focus on discourse operations. First, we consider the challenging task of simplifying highly technical language, in our case, medical texts. We introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric to quantify stylistic differences between the two, and models for paragraph-level simplification. Second, we present the first data-driven study of inserting elaborations and explanations during simplification, and illustrate the richness and complexities of this phenomenon.