Rajiv Jain and Chris Tensmeyer (Adobe) “Document Intelligence at Adobe Research”
3400 N. Charles Street, Baltimore, MD 21218
Abstract
Visually rich documents (scanned or digital) remain important for many consumer and business use cases. In this talk, we will share recent work from our team in the Document Intelligence Lab of Adobe Research to understand, create, and interact with these documents. First, we’ll present a series of projects on building models that decompose and understand the structure of documents to support use cases around document analysis and accessibility. Next, we’ll explore document semantic understanding through a project in which we convert natural language contract clauses to code to support business automation. Finally, we’ll discuss DocEdit, a model and dataset that enable editing structured documents from natural language.
Bios
Rajiv Jain is a Senior Research Scientist in the Document Intelligence Lab at Adobe Research, where his research focuses on understanding the layout and content of documents and interaction with them. Prior to joining Adobe, Rajiv was a consultant at DARPA, where he worked on the Media Forensics Program to secure digital imagery. He previously served for 10 years as a researcher for the Department of Defense, where he worked on projects around large-scale systems, computer vision, and network security. He received his PhD in computer science from the University of Maryland, College Park, working in the field of document image analysis and retrieval.
Chris Tensmeyer is a Research Scientist in the Document Intelligence Lab at Adobe Research, where he primarily focuses on multi-modal document layout and content understanding. Since joining Adobe five years ago, his work has directly impacted popular Adobe features such as mobile Acrobat Liquid Mode, PDF table extraction, handwriting recognition, and scanned document detection. His other research interests include general computer vision and deep learning. He received his PhD in computer science from Brigham Young University on the topic of Deep Learning for Document Image Analysis.