PH.D. THESIS TOPIC
Neuro-Computational Basis of Sound Object Recognition
- Developed novel feature extraction methods that are highly robust to noise for both speech recognition and speaker identification tasks.
- These features capture the speech-specific regions of the modulation domain that maximize reliability.
- A multi-stream approach that further divides this region into subparts performs better across various noise conditions.
- Successfully demonstrated models for timbre that capture the perceptual space of musical instruments.
- Developed attentional mechanisms in this perceptual space that can further boost the representation of any given target object.
- Developed methods to adapt feature extraction and modeling stages to out-of-domain data.
Derived STRF (Spectro-Temporal Receptive Field) contours for speech recognition
- Developed an algorithm to learn STRFs from speech data that give a sustained response.
- Successfully used the resulting contour to derive robust features that show improved performance in noisy conditions.
Phoneme recognition framework using STRFs
- Developed a mechanism to automatically select STRF features for each broad phoneme class.
- Combined posteriors from multilayer perceptrons trained on these features give improved performance.
Multi-resolution Analysis for Lung sounds
- Successfully extracted multidimensional features from lung sounds that were able to predict the presence of abnormalities.
Speech-based filter banks
- Derived filter banks from the average spectrum of speech and compared them with perception-based mel-like filter banks.
Rock Climbing, Tennis, Racquetball, Hiking, Kayaking.
A Phoneme Recognition Framework based on Auditory Spectro-Temporal Receptive Fields
Samuel Thomas, Kailash Patil, Sriram Ganapathy, Nima Mesgarani and Hynek Hermansky
Interspeech – 2010
© Kailash Patil, 2015. All rights reserved.