Channel Combination Techniques for Far Field Automatic Speech Recognition

John McDonough
Institut für Theoretische Informatik
Universität Karlsruhe
Karlsruhe, Germany

Interest within the automatic speech recognition (ASR) research community has recently focused on the recognition of speech captured with one or more microphones located in the far field, rather than mounted on a headset and positioned next to the speaker's mouth. The capacity to recognize such speech reliably is a primary requirement for making ASR a viable modality for so-called ubiquitous computing. Far field ASR is a natural application for multiple microphones, whose signals can be combined in different ways. On the signal side, combination can be accomplished by beamforming with a microphone array of known geometry, or by blind channel combination techniques if the geometry is unknown. On the word hypothesis side, combination can be achieved through confusion network combination. In this talk, we describe the components of a far field ASR system, including the source localizer, beamformer, speech activity detector, and ASR engine. We also compare the effectiveness of the several signal combination techniques with one another and with the performance achieved using a close talking microphone. Finally, we describe a set of extension modules for the Python scripting language that implement the algorithms described here.

Joint work with Hazim Kemal Ekenel, Tobias Gehrig, Ulrich Klee, Kai Nickel and Matthias Wölfel.