Principles, Practice, and Impact of Multimodal Interaction – Philip R. Cohen (Oregon Graduate Institute of Science and Technology)

February 9, 1999 (all day)

A new generation of systems is emerging in which the user is able to employ natural communication modalities, including speech and pen-based gesture, in addition to the usual graphical user interface technologies. Multimodal systems incorporating pen and voice communication are advantageous for both very small and very large devices, for spatially-oriented applications, and for contexts emphasizing user mobility.

These advantages will be illustrated through QuickSet, a handheld, collaborative, multimodal system that allows continuous speech and pen-based gesturing as input. QuickSet uses a distributed agent architecture, runs on personal computers, and is scalable from wearable to wall-sized systems. Among QuickSet's applications are initializing military simulations, control of virtual reality environments, logistics planning, and medical informatics.

The core of QuickSet is a principled method for combining information derived from different modes. We discuss how a set of meaning fragments produced by recognizers for multiple modes can be unified to determine the best joint interpretation. This unification process will be shown to support multimodal discourse and mutual disambiguation of those meaning fragments.

Finally, to assess the impact of multimodal interaction, a study will be described in which expert users completed map-based military tasks using both a graphical user interface and QuickSet. In brief, with the multimodal interface, users positioned entities on a map 3–8 times faster than with the graphical user interface. Multimodal interaction was preferred by all users, particularly for its efficiency and for its precision in drawing. To illustrate the QuickSet technology and its applications, a video and demonstration of the system will be given.
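The abstract does not give QuickSet's actual fusion algorithm, but the idea of unifying meaning fragments can be sketched with a toy feature-structure unification in Python. Here the fragments, the `unify` helper, and the example speech/gesture inputs (a hypothetical "create platoon" command whose type comes from speech and whose location comes from the pen) are all illustrative assumptions, not QuickSet code:

```python
# Toy feature-structure unification, assuming meaning fragments are
# represented as nested dicts. QuickSet's real representation and
# algorithm are not described in this abstract.

FAIL = object()  # sentinel marking a failed unification


def unify(a, b):
    """Recursively unify two fragments; return FAIL on conflict."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is FAIL:
                    return FAIL  # conflicting features: no joint reading
                result[key] = merged
            else:
                result[key] = value  # one mode fills in what the other lacks
        return result
    return a if a == b else FAIL


# Hypothetical fragments: speech supplies the entity type, the pen
# gesture supplies the map location; unification yields the joint
# interpretation, and conflicting fragments are filtered out.
speech = {"cmd": "create", "object": {"type": "platoon"}}
gesture = {"cmd": "create", "object": {"location": (45.52, -122.68)}}

joint = unify(speech, gesture)
```

Because incompatible fragment pairs unify to `FAIL`, each mode can rule out the other's misrecognitions, which is one simple way to picture the mutual disambiguation the talk describes.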
Dr. Phil Cohen is currently Professor and Co-director of the Center for Human-Computer Communication at the Oregon Graduate Institute. Prior to joining OGI, he was a senior computer scientist with the Artificial Intelligence Center of SRI International. His research interests include multimodal human-computer interaction, multiagent architectures, dialogue, computational linguistics, theories of communication and collaboration, mobile computing, and collaboration technology. His research is presently supported by DARPA, ONR, NSF, Microsoft, Intel, Boeing, and France Telecom.

Center for Language and Speech Processing