Workshop 2007
Guest Lecture Friday, September 5, 2008


Seminar Information

The Role of the Cochlea in Human Speech Recognition - Jont Allen


  • Abstract:

    The most important communication signal is human speech. It is useful to think of speech communication in terms of Claude Shannon's information-theoretic channel model. When viewed as such, it soon becomes clear that the most complex part of speech communication is the auditory system (the receiver). In my opinion, relatively little is known about how the human auditory system decodes speech. My research has studied this problem using simple, isolated, natural consonant-vowel (CV) confusions, measured as a function of the speech-to-noise ratio (SNR), with several types of masking noise. In one type of experiment we selectively remove islands of speech and then correlate the resulting modified speech against subject scores. This method has allowed us to isolate the information-bearing portions of the speech. Our most important conclusions to date are:

    1) The across-frequency onset transient portion of the signal is typically the most important.

    2) The spectral regions of these transients are used to code different consonants.

    3) The frequency regions for a given consonant are correlated with the following vowel.

    4) Compact spectral-temporal amplitude modulation components (e.g., a 10 Hz modulation) do not seem to play a significant role.

    5) There is some evidence that frequency modulations may play a role, but this remains unproven.

    The above results are complemented by similar studies on hearing-impaired ears. Given cochlear damage, speech scores are greatly reduced, even when audibility is accounted for. The exact reasons for this SNR loss presently remain unclear, but I speculate that its source must be cochlear, and related to nonlinear outer hair cell temporal processing. Specifically, "edge enhancement" of the speech signal and forward masking could easily be modified in such ears, leading to SNR loss. Whatever the reason, it is the key problem that needs to be fully researched.

    Live demos will be played, including "edge-enhanced" speech signals, which have greater robustness to noise.

  • Biography:

    Dr. Jont Allen received a BS in EE from the University of Illinois in 1966 and a PhD from the University of Pennsylvania in 1970. He then joined Bell Laboratories in 1970, where he was in the Acoustics Research Department as a Distinguished Member of Technical Staff. From 1996 to 2002 he worked at AT&T Labs as a Technology Leader. In August 2003 he joined the ECE faculty at the University of Illinois at Urbana-Champaign (UIUC).

    Dr. Allen is interested in cochlear modeling, noninvasive diagnostic testing of cochlear function (such as DPOAE and power reflectance measurements in the ear canal), auditory psychophysics, speech processing for hearing aid applications (noise reduction and multiband compression), speech and music coding (bit-rate reduction), and speech perception (models of loudness and masking). He is presently working on the problem of human speech recognition, with the goal of improving automatic speech recognition robustness in the presence of noise and filtering.




The Center for Language and Speech Processing
The Johns Hopkins University
3400 North Charles Street, Barton Hall
Baltimore, MD 21218
Telephone: (410) 516-4237, Fax: (410) 516-5050, E-mail: clsp@clsp.jhu.edu