Reno Kriz (HLTCOE – JHU) “Towards a Practically Useful Text Simplification System”
3400 N. Charles Street
While there is a vast amount of text written about nearly any topic, much of it is difficult for someone unfamiliar with the specific field to understand. Automated text simplification aims to reduce the complexity of a document, making it more comprehensible to a broader audience. Much of the research in this field has traditionally focused on simplification sub-tasks, such as lexical, syntactic, or sentence-level simplification. However, current systems struggle to consistently produce high-quality simplifications. Phrase-based models tend to make too many poor transformations; recent neural models, on the other hand, produce grammatical output but often do not make all the needed changes to the original text. In this thesis, I discuss novel approaches for improving lexical and sentence-level simplification systems. For sentence simplification models, after noting that encouraging diversity at inference time leads to significant improvements, I take a closer look at the idea of diversity and perform an exhaustive comparison of diverse decoding techniques on other generation tasks. I also discuss limitations in the framing of current simplification tasks, which keep these models from being practically useful. To address this, I propose a retrieval-based reformulation of the problem. Specifically, starting with a document, I identify concepts critical to understanding its content, retrieve documents relevant to each concept, and then re-rank them based on the desired complexity level.
I’m a research scientist at the HLTCOE at Johns Hopkins University. My primary research interests are in language generation, diverse and constrained decoding, and information retrieval. During my PhD, I focused mainly on the task of text simplification, and I am now working on formulating structured prediction problems as end-to-end generation tasks. I received my PhD in July 2021 from the University of Pennsylvania, where I worked with Chris Callison-Burch and Marianna Apidianaki.