Kailash Patil

[email protected]


Neuro-Computational Basis of Sound Object Recognition

  • Developed novel feature extraction methods that are highly robust to noise in both speech and speaker identification tasks.
    • These features capture speech-specific regions in the modulation domain that maximize reliability.
    • A multi-stream approach that further divides this region into subparts performs better in various noise conditions.
  • Demonstrated models of timbre that capture the perceptual space of musical instruments.
    • Developed attentional mechanisms in this perceptual space that can further boost the representation of any given target object.
    • Developed methods to adapt feature extraction and modeling stages to out-of-domain data.

Other Projects

Derived-STRF (Spectro-Temporal Receptive Field) contours for Speech recognition

  • Developed an algorithm to learn STRFs from speech data that give a sustained response.
  • Used the resulting contour to derive robust features that show improved performance in noisy conditions.

Phoneme recognition framework using STRFs

  • Developed a mechanism to automatically select STRF features for each broad phoneme class
  • Combined posteriors from multi-layer perceptrons trained on these features to improve performance.

Multi-resolution Analysis for Lung sounds

  • Extracted multidimensional features from lung sounds that predict the presence of abnormalities.

Speech based filter banks

  • Derived filter banks from the average spectrum of speech and compared them with perception-based mel-like filter banks.


Other Interests

Rock Climbing, Tennis, Racquetball, Hiking, Kayaking.




A Phoneme Recognition Framework based on Auditory Spectro-Temporal Receptive Fields
Samuel Thomas, Kailash Patil, Sriram Ganapathy, Nima Mesgarani and Hynek Hermansky
Interspeech – 2010

© Kailash Patil, 2015. All rights reserved.

Johns Hopkins University

Johns Hopkins University, Whiting School of Engineering

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2680
