Publications

Articles in Refereed Journals

Sriram Ganapathy, Samuel Thomas, and Hynek Hermansky, “Temporal envelope compensation for robust phoneme recognition using modulation spectrum,” Journal of the Acoustical Society of America, 2011, in print.
G. Sivaram, S. Nemala, N. Mesgarani, H. Hermansky, “Data-driven and feedback based spectro-temporal features for speech recognition,” IEEE Signal Processing Letters, vol. 17, no. 11, 2010.
J. Pinto, G. Sivaram, M. Magimai.-Doss, H. Hermansky, and H. Bourlard, “Analyzing MLP Based Hierarchical Phoneme Posterior Probability Estimator,” IEEE Transactions on Audio, Speech, and Language Processing, 2010.
Sriram Ganapathy, Petr Motlicek and Hynek Hermansky, “Autoregressive Models of Amplitude Modulations in Audio Compression,” IEEE Transactions on Audio, Speech, and Language Processing, 2010.
Petr Motlicek, Sriram Ganapathy, Hynek Hermansky and Harinath Garudadri, “Wide-Band Audio Coding based on Frequency Domain Linear Prediction,” EURASIP Journal on Audio, Speech, and Music Processing, Special Issue on, 2009
Sriram Ganapathy, Samuel Thomas, and Hynek Hermansky, “Modulation frequency features for phoneme recognition in noisy speech,” Journal of the Acoustical Society of America, 2009.
Samuel Thomas, Sriram Ganapathy and Hynek Hermansky, “Recognition Of Reverberant Speech Using Frequency Domain Linear Prediction,” IEEE Signal Processing Letters, 2008.
N. Morgan, Q. Zhu, A. Stolcke, K. Sonmez, S. Sivadas, T. Shinozaki, M. Ostendorf, P. Jain, H. Hermansky, D. Gelbart, D. Ellis, G. Doddington, B. Chen, O. Cetin, H. Bourlard and M. Athineos, “Pushing the Envelope – Aside: Beyond the Spectral Envelope as the Fundamental Representation for Speech Recognition,” Invited Paper in the IEEE Signal Processing Magazine, 2005.
H. Hermansky and N. Morgan, “Show What You Know: Musings on the Reporting of Negative Results in Speech Recognition Research,” Invited Editorial Note in Journal of Negative Results in Speech and Audio Sciences, 2004.
H. Yang, S. Sharma, S. van Vuuren, H. Hermansky, “Relevance of Time-Frequency Features for Phonetic and Speaker/Channel Classification,” Speech Communication, August 2000.
N. Malayath, H. Hermansky, S. Kajarekar and B. Yegnanarayana, “Data-Driven Temporal Filters and Alternatives to GMM in Speaker Verification,” in Digital Signal Processing, Vol. 10, pp. 55-74, 2000.
N. Kanedera, T. Arai, H. Hermansky and M. Pavel, “On the Relative Importance of Various Components of the Modulation Spectrum of Speech,” Speech Communication 28 (1), pp. 43-56, May 1999.
B. Yegnanarayana, C. Avendano, H. Hermansky and P. S. Murthy, “Speech Enhancement Using Linear Prediction Residual,” Speech Communication 28 (1), pp. 25-42, May 1999.
T. Arai, M. Pavel, H. Hermansky and C. Avendano, “Syllable Intelligibility for Temporally-Filtered LPC Cepstral Trajectories,” Journal of the Acoustical Society of America, Vol. 105, No. 5, pp. 2783-2791, May 1999.
H. Hermansky, “Should recognizers have ears?” in Speech Communication, vol. 25, pp. 3-27, 1998.
H. Bourlard, H. Hermansky and N. Morgan, “Towards Increasing Speech Recognition Error Rates,” invited paper, Speech Communication, Vol. 18 (4), May 1996.
H. Hermansky, “Robust Speech Recognition,” in Cole, Hirschman et al., “The Challenge of Spoken Language Systems, Research Directions for the Nineties,” IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, pp. 1-21, January 1995.
H. Hermansky and N. Morgan, “RASTA Processing of Speech,” IEEE Transactions on Speech and Audio Processing, Vol. 2, No. 4, pp. 578-589, October 1994.
J. C. Junqua, H. Wakita and H. Hermansky, “Optimizing Perceptually Based ASR Front End,” IEEE Transactions on Speech and Audio Processing, Vol. 1, No. 1, pp. 39-49, January 1993.
H. Hermansky, “Perceptual Linear Predictive (PLP) Analysis of Speech,” Journal of the Acoustical Society of America, Vol. 87, No. 4, April 1990, pp. 1738-1752.

Book Chapters

F. Valente and H. Hermansky, “Data-driven extraction of spectral-dynamics based posteriors,” to appear in DARPA GALE book, Joseph Olive (Ed.).
H. Hermansky, “Data-driven extraction of temporal features from speech,” in Dynamics of Speech Production and Perception 5, P. Divenyi et al. (Eds.), IOS Press, 2006.
H. Hermansky, “Speech and its processing,” in Language and Speech Engineering, M. Rajman (Ed.), EPFL Press, 2006.
C. Avendano, L. Deng, H. Hermansky and B. Gold, “Analysis and Representation of Speech,” in Speech Processing in the Auditory System, Greenberg and Ainsworth (Eds.), Springer 2004.
N. Morgan, H. Bourlard and H. Hermansky, “Automatic Speech Recognition: an Auditory Perspective,” in Speech Processing in the Auditory System, Greenberg and Ainsworth (Eds.), Springer 2004.
H. Hermansky and N. Morgan, “Automatic Speech Recognition,” in Encyclopedia of Cognitive Science, L. Nadel (Ed.), Nature Publishing Group, Macmillan Publishers, 2002.
H. Hermansky, “Modulation Spectrum in Speech Processing,” in Signal Analysis and Prediction, Prochazka, Uhlir, Rayner and Kingsbury (Eds.), Birkhauser Boston, 1998.

Peer Reviewed Papers in Conference Proceedings

Samuel Thomas, Patrick Nguyen, Geoffrey Zweig, Hynek Hermansky, “MLP Based Phoneme Detectors For Automatic Speech Recognition,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 2011, Prague, Czech Republic.
Geoffrey Zweig, Patrick Nguyen, Dirk Van Compernolle, Kris Demuynck, Les Atlas, Pascal Clark, Greg Sell, Meihong Wang, Fei Sha, Hynek Hermansky, Damianos Karakos, Aren Jansen, Samuel Thomas, Sivaram S.V.S., Sam Bowman, Justine Kao, “Speech Recognition With Segmental Conditional Random Fields: A Summary Of The JHU CLSP 2010 Summer Workshop,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 2011, Prague, Czech Republic.
Sivaram Garimella and Hynek Hermansky, “Multilayer Perceptron with Sparse Hidden Outputs for Phoneme Recognition,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 2011, Prague, Czech Republic.
Nima Mesgarani, Samuel Thomas, Hynek Hermansky, “A Multistream Multiresolution Framework for Phoneme Recognition,” Proc. INTERSPEECH 2010, Tokyo, Japan, pp. 318-321.
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky, “Cross-Lingual and Multi-Stream Posterior Features for Low Resource LVCSR Systems,” Proc. INTERSPEECH 2010, Tokyo, Japan, pp. 877-880.
Aren Jansen, Kenneth Church, Hynek Hermansky, “Towards Spoken Term Discovery at Scale with Zero Resources,” Proc. INTERSPEECH 2010, Tokyo, pp. 1676-1679.
S.V.S. Sivaram, Sriram Ganapathy, Hynek Hermansky, “Sparse Auto-Associative Neural Networks: Theory and Application to Speech Recognition,” Proc. INTERSPEECH 2010, Tokyo, Japan, pp. 2270-2273.
Samuel Thomas, Kailash Patil, Sriram Ganapathy, Nima Mesgarani, Hynek Hermansky, “A Phoneme Recognition Framework Based on Auditory Spectro-Temporal Receptive Fields,” Proc. INTERSPEECH 2010, Tokyo, Japan, pp. 2458-2461.
Hynek Hermansky, “History Of Modulation Spectrum In ASR,” Invited Paper, Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 2010, Dallas, Texas.
Sivaram Garimella, Sridhar Krishna Nemala, Mounya Elhilali, Trac Tran, Hynek Hermansky, “Sparse Coding For Speech Recognition,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 2010, Dallas, Texas.
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky, “Robust Spectro-Temporal Features Based On Autoregressive Models Of Hilbert Envelopes,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 2010, Dallas, Texas.
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky, “Comparison Of Modulation Features For Phoneme Recognition,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 2010, Dallas, Texas.
Shih-Chii Liu, Nima Mesgarani, John Harris, Hynek Hermansky, “The Use of Spike-Based Representations for Hardware Audition System,” IEEE International Symposium on Circuits and Systems 2010, Paris, France.
Tobi Delbruck, Thomas Koch, Raphael Berner, Hynek Hermansky, “Fully Integrated 500uW Speech Detection Wake-Up Circuit,” IEEE International Symposium on Circuits and Systems 2010, Paris, France.
S. Kombrink, M. Hannemann, L. Burget, and H. Hermansky, “Recovery of rare words in lecture speech,” in Proceedings of Text, Speech and Dialogue 2010, vol. 2010, no. 9, Springer Verlag, 2010, pp. 330-337.
Petr Motlicek, Sriram Ganapathy and Hynek Hermansky, “Arithmetic Coding of Sub-Band Residuals in FDLP Speech/Audio Codec,” in 10th Annual Conference of the International Speech Communication Association, Brighton, England, pp. 2591-2594, ISCA, 2009.
Sriram Ganapathy, Petr Motlicek and Hynek Hermansky, “Error Resilient Speech Coding Using Sub-band Hilbert Envelopes,” in 12th International Conference on Text, Speech and Dialogue, TSD 2009, Pilsen, Czech Republic, pp. 355-362, Springer-Verlag, Berlin Heidelberg, 2009.
Sriram Ganapathy, Petr Motlicek and Hynek Hermansky, “MDCT for Encoding Residual Signals in Frequency Domain Linear Prediction,” in Audio Engineering Society (AES) 127th Convention, Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA, 2009.
Joel Praveen Pinto, G. S. V. S. Sivaram, Hynek Hermansky and Mathew Magimai-Doss, “Volterra Series for Analyzing MLP based Phoneme Posterior Probability Estimator,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2009.
Sriram Ganapathy, Samuel Thomas, Hynek Hermansky, “Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation,” Proceedings of the Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2009), 2009, New Paltz, New York, U.S.A.
Nima Mesgarani, G.S.V.S. Sivaram, Sridhar K. Nemala, Hynek Hermansky, “Discriminant Spectrotemporal Features for Phoneme Recognition,” Proc. Interspeech 2009, Brighton, U.K.
S. Ganapathy, S. Thomas and H. Hermansky, “Static and Dynamic Modulation Spectrum for Speech Recognition,” Proc. Interspeech 2009, Brighton, U.K.
S. Thomas, S. Ganapathy and H. Hermansky, “Tandem Representations of Spectral Envelope and Modulation Frequency Features for ASR,” Proc. Interspeech 2009, Brighton, U.K.
P. Motlicek, S. Ganapathy and H. Hermansky, “Arithmetic Coding of Sub-band Residuals in FDLP Speech/Audio Codec,” Proc. Interspeech 2009, Brighton, U.K.
S. Kombrink, L. Burget, P. Matejka, M. Karafiat and H. Hermansky, “Posterior-based Out-of-Vocabulary Word Detection in Telephone Speech,” Proc. Interspeech 2009, Brighton, U.K.
Sriram Ganapathy, Petr Motlicek, Hynek Hermansky and Harinath Garudadri, “Autoregressive Modelling of Hilbert Envelopes for Wide-band Audio Coding,” in AES 124th Convention, Audio Engineering Society, 2008.
Joel Pinto, Sivaram S.V.S., Hynek Hermansky, Mathew Magimai.-Doss, “Volterra Series For Analyzing MLP Based Phoneme Posterior Estimator,” Proceedings of ICASSP 2009.
Samuel Thomas, Sriram Ganapathy, Hynek Hermansky, “Phoneme Recognition Using Spectral Envelope And Modulation Frequency Features,” Proceedings of ICASSP 2009.
Misha Pavel, Malcolm Slaney, Hynek Hermansky, “Reconciliation of human and machine speech recognition performance,” in Proceedings of ICASSP 2009.
Daphna Weinshall, Hynek Hermansky, Alon Zweig, Jie Luo, Holly Jimison, Frank Ohl, Misha Pavel, “Beyond Novelty Detection: Incongruent Events, when General and Specific Classifiers Disagree,” Proceedings of the Neural Information Processing Conference, Advances in Neural Information Processing 21, MIT Press.
Joel Pinto and Hynek Hermansky, “Combining Evidence from a Generative and a Discriminative Model in Phoneme Recognition,” Proceedings of Interspeech, 2008, Brisbane, Australia.
Fabio Valente and Hynek Hermansky, “On the Combination of Auditory and Modulation Frequency Channels for ASR applications,” Proceedings of Interspeech, 2008, Brisbane, Australia.
Sriram Ganapathy, Petr Motlicek, Hynek Hermansky, Harinath Garudadri, “Spectral Noise Shaping: Improvements in Speech/Audio Codec Based on Linear Prediction in Spectral Domain,” Proceedings of Interspeech, 2008, Brisbane, Australia.
Samuel Thomas, Sriram Ganapathy and Hynek Hermansky, “Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain”, Proceedings of EUSIPCO, 2008
Joel Pinto, Igor Szöke, R. M. Prasanna, Hynek Hermanský, “Fast Approximate Spoken Term Detection from Sequence of Phonemes,” in The 31st Annual International ACM SIGIR Conference, 20-24 July 2008, Singapore, ACM, 2008, pp. 28-33, ISBN 978-90-365-2697-5.
Samuel Thomas, Sriram Ganapathy and Hynek Hermansky, “Hilbert Envelope Based Features for Far-Field Speech Recognition”, Proceedings of MLMI, 2008
Sriram Ganapathy, Samuel Thomas and Hynek Hermansky, “Front-end for Far-Field Speech Recognition based on Frequency Domain Linear Prediction”, Proceedings of Interspeech, 2008, Brisbane, Australia
Samuel Thomas, Sriram Ganapathy and Hynek Hermansky, “Hilbert Envelope Based Spectro-Temporal Features for Phoneme Recognition in Telephone Speech”, Proceedings of Interspeech, 2008, Brisbane, Australia
S.V.S. Sivaram and Hynek Hermansky, “Emulating Temporal Receptive Fields of Auditory Mid-brain Neurons for Automatic Speech Recognition,” 16th European Signal Processing Conference, Lausanne, Switzerland, August 2008.
Joel Pinto, G.S.V.S. Sivaram, and Hynek Hermansky, “Reverse Correlation for Analyzing MLP Posterior Features in ASR,” 11th International Conference on Text, Speech and Dialogue, Brno, Czech Republic (TSD-08).
Garimella V.S. Sivaram and Hynek Hermansky, “Emulating Temporal Receptive Fields of Higher Level Auditory Neurons for ASR,” 11th International Conference on Text, Speech and Dialogue, TSD 2008, Brno, Czech Republic, September 8–12, 2008.
S.V.S. Sivaram and Hynek Hermansky, “Introducing Temporal Asymmetries in Feature Extraction for Automatic Speech Recognition,” Interspeech 2008, Brisbane, Australia.
Sree Hari Krishnan Parthasarathi, Petr Motlicek, and Hynek Hermansky, “Exploiting contextual information for speech/non-speech detection,” Text, Speech and Dialogue, 2008.
Sriram Ganapathy, Petr Motlicek, Hynek Hermansky, Harinath Garudadri, “Autoregressive Modeling of Hilbert Envelopes for Wide-Band Audio Coding,” 124th Convention of the Audio Engineering Society, Amsterdam, 2008.
Lukas Burget, Petr Schwarz, Pavel Matejka, Mirko Hannemann, Ariya Rastrow, Christopher White, Sanjeev Khudanpur, Hynek Hermansky, Jan Cernocky, “Combination Of Strongly And Weakly Constrained Recognizers For Reliable Detection Of Out-Of-Vocabulary Words (OOVs),” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2008.
Sriram Ganapathy, Petr Motlicek, Hynek Hermansky, Harinath Garudadri, “Temporal Masking For Bit-Rate Reduction In Audio Codec Based On Frequency Domain Linear Prediction,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2008.
Joel Pinto, Yegnanarayana Bayya, Hynek Hermansky, Mathew Magimai-Doss, “Exploiting Contextual Information For Improved Phoneme Recognition,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2008.
Fabio Valente, Hynek Hermansky, “Hierarchical and Parallel Processing of Modulation Spectrum for ASR applications,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2008.
F. Valente and H. Hermansky, “Multi-stream Features Combination based on Dempster-Shafer Rule for LVCSR System,” in Proceedings of the International Conference on Spoken Language Processing, Antwerp, Belgium, 2007.
Christopher White, Geoffrey Zweig, Lukas Burget, Petr Schwarz, Hynek Hermansky, “Confidence Estimation, OOV Detection And Language ID Using Phone-To-Word Transduction And Phone-Level Alignments,” Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2008.
F. Valente and H. Hermansky, “Combination of Acoustic Classifiers based on Dempster-Shafer Theory of evidence,” in Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2007.
Fabio Valente, Jithendra Vepa, Christian Plahl, Christian Gollan, Hynek Hermansky, Ralf Schlüter, “Hierarchical neural networks feature extraction for LVCSR system,” in Proceedings of Interspeech 2007, pp. 42-45.
P. Motlicek, V. Ullal, H. Hermansky, “Wide-Band Perceptual Audio Coding based on Frequency-Domain Linear Prediction,” in Proceedings of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2007.
R. M. Prasanna and Hynek Hermansky, “MRASTA and PLP in Automatic Speech Recognition,” in Proceedings of the International Conference on Spoken Language Processing, Antwerp, Belgium, 2007.
F. Valente and H. Hermansky, “Hierarchical Neural Networks Feature Extraction for LVCSR System,” in Proceedings of the International Conference on Spoken Language Processing, Antwerp, Belgium, 2007.
J. Pinto, A. Lovitt and H. Hermansky, “Exploiting Phoneme Similarities in Hybrid HMM-ANN Keyword Spotting,” in Proceedings of the International Conference on Spoken Language Processing, Antwerp, Belgium, 2007.
H. Ketabdar, M. Hannemann, and H. Hermansky, “Detection of Out-of-Vocabulary Words in Posterior Based ASR,” in Proceedings of the International Conference on Spoken Language Processing, Antwerp, Belgium, 2007.
F. Valente and H. Hermansky, “Combination of Acoustic Classifiers based on Dempster-Shafer Theory of Evidence,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Honolulu, HI, 2007.
P. Motlicek, V. Ullal and H. Hermansky, “Wide-Band Perceptual Audio Coding based on Frequency-Domain Linear Prediction,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Honolulu, HI, 2007.
F. Valente and H. Hermansky, “Discriminant Linear Processing of Time-Frequency Plane,” in Proceedings of the International Conference on Spoken Language Processing, Pittsburgh, PA, 2006.
P. Fousek and H. Hermansky, “Towards ASR Based on Hierarchical Posterior-Based Keyword Recognition,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Toulouse, France, 2006.
P. Motlicek, H. Hermansky, H. Garudadri and N. Srinivasamurthy, “Speech Coding based on Spectral Dynamics,” in Proceedings of the International Conference on Text, Speech and Dialogue, Brno, Czech Republic, 2006.
H. Hermansky, P. Fousek and M. Lehtonen, “The Role of Speech in Multimodal Human-Computer Interaction (Towards Reliable Rejection of Non-Keyword Input),” in Proceedings of the International Conference on Text, Speech and Dialogue, Carlsbad, Czech Republic, 2005.
H. Hermansky and P. Fousek, “Multi-resolution RASTA filtering for TANDEM-based ASR,” in Proceedings of the European Conference on Speech Communication and Technology, Lisbon, Portugal, 2005.
M. Athineos, H. Hermansky, and D. P. W. Ellis, “LP-TRAP: Linear Predictive Temporal Patterns,” Proc. ICSLP, pp. 1154-1157, Jeju, S. Korea, October 2004.
M. Athineos, H. Hermansky and D. Ellis, “PLP^2: Autoregressive modeling of auditory-like 2-D spectro-temporal patterns,” in Proceedings of the ISCA Tutorial and Research Workshop (ITRW) on Statistical and Perceptual Audio Processing (SAPA), ICC Jeju, Korea, October 3, 2004.
P. Fousek, P. Svojanovsky, F. Grezl and H. Hermansky, “New Nonsense Syllables Database – Analyses and Preliminary ASR Experiments,” in Proceedings of the International Conference on Spoken Language Processing, Jeju Island, Korea, 2004.
S. Sivadas and H. Hermansky, “On use of task-independent data in TANDEM feature extraction,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, Canada, 2004.
S. Ikbal, H. Misra, H. Bourlard and H. Hermansky, “Phase AutoCorrelation (PAC) features in Entropy based Multi-Stream for Robust Speech Recognition,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, Canada, 2004.
H. Misra, S. Ikbal, H. Bourlard and H. Hermansky, “Spectral Entropy Based Feature for Robust ASR,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, Canada, 2004.
H. Hermansky and H. Bourlard, “Some Emerging Concepts in Speech Recognition,” Invited Paper in Proceedings of Lectures by Masters in Speech Processing, Maui, HI, 2004.
H. Hermansky, “Data-Guided Processing of Speech,” Invited Paper in Proceedings of the International Workshop on Speech and Computer, Moscow, Russia, 2003.
H. Hermansky, “Data Guided Processing of Speech,” Invited Paper in Proceedings of the Indian Workshop on Spoken Language Processing, Tata Institute of Fundamental Research (TIFR), Mumbai, India, 2003.
H. Hermansky, “TRAP-TANDEM: Data-driven extraction of temporal features from speech,” in Proceedings of the IEEE Workshop in Automatic Speech Recognition and Understanding, U.S. Virgin Islands, 2003.
A. Adami, L. Burget, S. Dupont, H. Garudadri, F. Grezl, H. Hermansky, P. Jain, S. Kajarekar, N. Morgan and S. Sivadas, “QUALCOMM-ICSI-OGI Features for ASR,” in Proceedings of the International Conference on Spoken Language Processing, Denver, CO, 2002.
S. Sivadas and H. Hermansky, “Hierarchical Tandem Feature Extraction,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Tampa, FL, 2002.
A. Adami, S. Kajarekar and H. Hermansky, “New Speaker-Change Detection Algorithm for Two-Speaker Segmentation,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Tampa, FL, 2002.
S. Kajarekar, B. Yegnanarayana and H. Hermansky, “A Study of Two Dimensional Linear Discriminants for ASR,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Salt Lake City, UT, 2001.
C. Benitez, L. Burget, B. Chen, S. Dupont, H. Garudadri, H. Hermansky, P. Jain, S. Kajarekar, N. Morgan and S. Sivadas, “Robust ASR Front-End Using Spectral-Based and Discriminant Features: Experiments on the AURORA Tasks,” in Proceedings of the European Conference on Speech Communication and Technology, Aalborg, Denmark, 2001.
H. Hermansky, “Human Speech Perception: Some Lessons from Automatic Speech Recognition,” in Proceedings of the International Conference on Text, Speech and Dialogue, Zelezna Ruda, Czech Republic, September 2001.
L. Burget and H. Hermansky, “Data Driven Design of Filter Bank for Speech Recognition,” in Proceedings of the International Conference on Text, Speech and Dialogue, Zelezna Ruda, Czech Republic, September 2001.
S. Kajarekar and H. Hermansky, “Analysis of Information in Speech and Its Application in Speech Recognition,” in Proceedings of the International Conference on Text, Speech and Dialogue, Brno, Czech Republic, 2000.
S. Sharma, D. Ellis, S. Kajarekar, P. Jain and H. Hermansky, “Feature Extraction Using Non-linear Transformation for Robust Speech Recognition on the AURORA Data-Base,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Istanbul, Turkey, 2000.
H. Hermansky, D. Ellis and S. Sharma, “Connectionist Feature Extraction for Conventional HMM Systems,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Istanbul, Turkey, 2000.
S. Kajarekar, N. Malayath and H. Hermansky, “ANOVA in Modulation Spectral Domain,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Istanbul, Turkey, 2000.
S. Sivadas, P. Jain and H. Hermansky, “Discriminative MLPs in HMM-Based Recognition of Speech in Cellular Telephony,” in Proceedings of the International Conference on Spoken Language Processing, Beijing, China, 2000.
S. Kajarekar and H. Hermansky, “Optimization of Units for Continuous-Digit Recognition Task,” in Proceedings of the International Conference on Spoken Language Processing, Beijing, China, 2000.
P. Jain and H. Hermansky, “Temporal Patterns of Critical-Band Spectrum for Text-to-Speech,” in Proceedings of the International Conference on Spoken Language Processing, Beijing, China, 2000.
H. Hermansky, S. Sharma and P. Jain, “Data-derived Nonlinear Mapping For Feature Extraction in HMM,” in Proceedings of the IEEE Workshop in Automatic Speech Recognition and Understanding, Keystone, CO, 1999.
S. Kajarekar, N. Malayath and H. Hermansky, “Analysis of Speaker and Channel Variability in Speech,” in Proceedings of the IEEE Workshop in Automatic Speech Recognition and Understanding, Keystone, CO, 1999.
H. Hermansky and P. Jain, “Down-Sampling Speech Representation in ASR,” in Proceedings of the European Conference on Speech Communication and Technology, Budapest, Hungary, 1999.
S. Kajarekar, N. Malayath and H. Hermansky, “Analysis of Source of Variability in Speech,” in Proceedings of the European Conference on Speech Communication and Technology, Budapest, Hungary, 1999.
H. Yang, S. van Vuuren and H. Hermansky, “Relevancy of Time-Frequency Features for Phonetic Classification Measured by Mutual Information,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Phoenix, AZ, 1999.
H. Hermansky and S. Sharma, “Temporal Patterns (TRAPS) in ASR of Noisy Speech,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Phoenix, AZ, 1999.
H. Hermansky, “Mel Cepstrum, Deltas, Double-Deltas, … – What Else Is New?” Invited Paper in Proceedings of the Workshop on Robust Methods for Speech Recognition in Adverse Conditions, Tampere, Finland, 1999.
S. van Vuuren and H. Hermansky, “On the Importance of Components of the Modulation Spectrum for Speaker Verification,” in Proceedings of the International Conference on Spoken Language Processing, Sydney, Australia, 1998.
S. van Vuuren and H. Hermansky, “Speaker Recognition in a Time-Feature Space,” Proceedings of the NIST Speaker Recognition Workshop, Gaithersburg, MD, 1998.
H. Hermansky, “Data-Driven Analysis of Speech,” Invited Paper, in Proceedings of the International Conference on Text, Speech and Dialogue, Brno, Czech Republic, 1998.
H. Hermansky and N. Malayath, “Speaker Verification Using Speaker-Specific Mappings,” in Proceedings of the Workshop on Speaker Recognition and its Commercial and Forensic Applications, Avignon, France, 1998.
S. van Vuuren and H. Hermansky, “MESS: A Modular Efficient Speaker Verification System,” in Proceedings of the Workshop on Speaker Recognition and its Commercial and Forensic Applications, Avignon, France, 1998.
S. Sharma, P. Vermeulen and H. Hermansky, “Combining Information from Multiple Classifiers for Speaker Verification,” in Proceedings of the Workshop on Speaker Recognition and its Commercial and Forensic Applications, Avignon, France, 1998.
H. Hermansky and S. Sharma, “TRAPS – Classifiers of Temporal Patterns,” in Proceedings of the International Conference on Spoken Language Processing, Sydney, Australia, 1998.
H. Hermansky and N. Malayath, “Spectral Basis Functions from Discriminant Analysis,” in Proceedings of the International Conference on Spoken Language Processing, Sydney, Australia, 1998.
B. Yegnanarayana, P. S. Murthy, C. Avendano and H. Hermansky, “Enhancement of Reverberant Speech Using LP Residual,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Seattle, WA, 1998.
N. Kanedera, H. Hermansky and T. Arai, “Desired Characteristics of Modulation Spectrum for Robust Automatic Speech Recognition,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Seattle, WA, 1998.
H. Hermansky, “The Modulation Spectrum in Automatic Recognition of Speech,” in Proceedings of the IEEE Workshop in Automatic Speech Recognition and Understanding, Santa Barbara, CA, 1997.
N. Malayath, H. Hermansky and A. Kain, “Towards Decomposing the Sources of Variability in Speech,” in Proceedings of the European Conference on Speech Communication and Technology, Rhodos, Greece, 1997.
H. Hermansky, “Speculations on Knowledge Versus Data in ASR,” in Proceedings of the ESCA-NATO Tutorial and Research Workshop on Robust Speech Recognition for Unknown Communication Channels, Pont-a-Mousson, France, 1997.
H. Hermansky, C. Avendano, S. van Vuuren and S. Tibrewala, “Recent Advances in Addressing Sources of Non-Linguistic Information,” in Proceedings of ESCA-NATO Tutorial and Research Workshop on Robust Speech Recognition for Unknown Communication Channels, Pont-a-Mousson, France, 1997.
H. Hermansky, “Auditory Modeling in Automatic Recognition of Speech,” Survey Keynote Paper in Proceedings of The First European Conference on Signal Analysis and Prediction, Prague, Czech Republic, 1997.
M. Pavel and H. Hermansky, “Information Fusion by Human and Machine,” Proceedings of the First European Conference on Signal Analysis and Prediction, Prague, Czech Republic, 1997.
C. Avendano and H. Hermansky, “On the Properties of Temporal Processing for Speech in Adverse Environments,” Proceedings of The IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Mohonk Mountain House, New Paltz, NY, 1997.
H. Hermansky, “Should Recognizers Have Ears?” invited tutorial paper in Proceedings of the ESCA-NATO Tutorial and Research Workshop on Robust Speech Recognition for Unknown Communication Channels, Pont-a-Mousson, France, 1997.
S. Tibrewala and H. Hermansky, “Multi-band and Adaptation Approaches to Robust Speech Recognition,” in Proceedings of the European Conference on Speech Communication and Technology, Rhodos, Greece, 1997.
S. van Vuuren and H. Hermansky, “Data-Driven Design of RASTA-Like Filters,” in Proceedings of the European Conference on Speech Communication and Technology, Rhodos, Greece, 1997.
B. Yegnanarayana, C. Avendano, H. Hermansky and P. S. Murthy, “Processing Linear Prediction Residual for Speech Enhancement,” in Proceedings of the European Conference on Speech Communication and Technology, Rhodos, Greece, 1997.
N. Kanedera, T. Arai, H. Hermansky and M. Pavel, “On the Importance of Various Modulation Frequencies for Speech Recognition,” in Proceedings of the European Conference on Speech Communication and Technology, Rhodos, Greece, 1997.
S. Tibrewala and H. Hermansky, “Sub-band Based Recognition of Noisy Speech,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Munich, Germany, 1997.
H. Bourlard, H. Hermansky and N. Morgan, “Copernicus and ASR Challenge: Waiting for Kepler,” invited keynote paper in Proceedings of Advanced Research Projects Agency Workshop on Automatic Speech Recognition, Harriman, NY, 1996.
H. Hermansky, S. Tibrewala and M. Pavel, “Towards ASR on Partially Corrupted Speech,” in Proceedings of the International Conference on Spoken Language Processing, Philadelphia, PA, 1996.
C. Avendano, H. Hermansky, M. Vis and A. Bayya, “Adaptive Speech Enhancement System Based on Frequency-Specific SNR Estimation,” in Proceedings of the Third International Workshop on Interactive Voice Technology for Telecommunications Applications, Basking Ridge, NJ, 1996.
C. Avendano, S. van Vuuren and H. Hermansky, “Data Based Filter Design for RASTA-Like Channel Normalization in ASR,” in Proceedings of the International Conference on Spoken Language Processing, Philadelphia, PA, 1996.
T. Arai, M. Pavel, H. Hermansky and C. Avendano, “Intelligibility of Speech with Filtered Time Trajectories of Spectral Envelopes,” in Proceedings of the International Conference on Spoken Language Processing, Philadelphia, PA, 1996.
C. Avendano and H. Hermansky, “Study on the Dereverberation of Speech Based on Temporal Envelope Filtering,” in Proceedings of the International Conference on Spoken Language Processing, Philadelphia, PA, 1996.
H. Hermansky, S. Greenberg and M. Pavel, “A Brief (100-200 ms) History of Time in Feature Extraction of Speech,” in Proceedings of the XV Annual Speech Research Symposium, Baltimore, MD, 1995.
H. Hermansky and M. Pavel, “Psychophysics of Speech Engineering Systems,” invited paper in Proceedings of the 13th International Congress on Phonetic Sciences, Stockholm, Sweden, 1995.
H. Hermansky, “Exploring Temporal Domain for Robustness in Speech Recognition,” invited paper in Proceedings of the 15th International Congress on Acoustics, Trondheim, Norway, 1995.
H. Hermansky, C. Avendano and E. Wan, “Noise Reduction and Recovery of Missing Frequencies in Speech,” in Proceedings of the XV Annual Speech Research Symposium, Baltimore, MD, 1995.
C. Avendano, H. Hermansky and E. Wan, “Beyond Nyquist: Recovery of Wide-band Speech from Narrow-band Speech,” in Proceedings of the European Conference on Speech Communication and Technology, Madrid, Spain, 1995.
H. Hermansky, E. Wan and C. Avendano, “Speech Enhancement Based on Temporal Processing,” Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, MI, 1995.
N. Morgan, H. Bourlard, S. Greenberg and H. Hermansky, “Stochastic Perceptual Models of Speech,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, MI, 1995.
N. Morgan, H. Bourlard, S. Greenberg and H. Hermansky, “Stochastic Perceptual Auditory-Event-Based Models for Speech Recognition,” in Proceedings of the International Conference on Spoken Language Processing, Yokohama, Japan, 1994.
Koehler, N. Morgan, H. Hermansky, H. G. Hirsch and G. Tong, “Integrating RASTA-PLP Into Speech Recognition,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Adelaide, Australia, 1994.
Hermansky, E. A. Wan and C. Avendano, “Noise Suppression in Cellular Communications,” in Proceedings of the Second International Workshop on Interactive Voice Technology for Telecommunications Applications, Kyoto, Japan, 1994.
Hermansky, “Speech beyond 20 ms: Speech Processing in Temporal Domain,” invited keynote lecture in Proceedings of the International Workshop on Human Interface Technology, Aizu, Japan, 1994.
Hermansky, N. Morgan and H. G. Hirsch, “Recognition of Speech in Additive and Convolutional Noise Based On RASTA Spectral Processing,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Minneapolis, MN, 1993.
Hermansky, N. Morgan, A. Bayya and P. Kohn, “RASTA-PLP Speech Analysis Technique,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, San Francisco, CA, 1992.
Hermansky, “Recognition of Information-bearing Elements in Speech”, Journal of the Acoustical Society of America, Vol. 114, No 5, Pt. 2, 2003.
Hermansky and N. Morgan, “Towards Handling Acoustic Environment in Spoken Language Processing,” Proc. International Conference on Speech and Language Processing, Banff, Canada, 1992.
Hermansky, “Ignorance-Based Verification of Some Concepts in Hearing,” in Proceedings of the Workshop on Speech Research, Indian Institute of Technology, Madras, India, 1992.
Morgan and H. Hermansky, “RASTA Extensions, Robustness to Additive and Convolutive Noise,” in Proceedings of the Workshop on Speech Processing in Adverse Environments, Cannes, France, 1992.
Hermansky, N. Morgan, A. Bayya and P. Kohn, “The Challenge of Inverse-E,” Proceedings of the IEEE Asilomar Conference on Signal, Systems and Computers, Asilomar, CA, 1991.
Hermansky, N. Morgan, A. Bayya and P. Kohn, “Compensation for the Effect of the Communication Channel in Auditory-Like Analysis of Speech,” in Proceedings of the European Conference on Speech Communication and Technology, Genova, Italy, 1991.
Hermansky, “In Search of Linguistic Information in Speech,” Proceedings of the Arden House Speech Recognition Workshop, Harriman, NY, 1991.
Hermansky and A. L. Cox Jr., “Perceptual Linear Predictive (PLP) Analysis-Resynthesis Technique,” in Proceedings of the European Conference on Speech Communication and Technology, Genova, Italy, 1991.
Morgan, H. Hermansky, H. Bourlard, P. Kohn and C. Wooters, “Continuous Speech Recognition Using PLP Analysis with Multilayer Perceptrons,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, Canada, 1991.
Bayya and H. Hermansky, “Towards Feature-Based Speech Metric,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Albuquerque, NM, 1990.
Hermansky and D. Broad, “The Effective Second Formant F2′ and the Vocal Tract Front Cavity,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Glasgow, Scotland, 1989.
Hermansky and J. C. Junqua, “Optimization of Perceptually Based ASR Front End,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, New York, NY, 1988.
Hermansky, “Automatic Speech Recognition and Human Auditory Perception,” in Proceedings of the European Conference on Speech Communication and Technology, Edinburgh, Scotland, 1987.
Hermansky, “Role of Relative Positions of Formant and Harmonic Peaks in Perception of Vowel-Like Stimuli,” STL Technical Reports, Santa Barbara, CA, 1987.
Hermansky, “An Efficient Speaker-Independent Automatic Speech Recognition By Simulation of Some Properties of Human Auditory Processing,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, TX, 1987.
Javkin, H. Hermansky and H. Wakita, “Interaction Between Formant and Harmonic Peaks in Vowel Perception,” Proceedings of Eleventh International Congress of Phonetic Sciences, Tallinn, Estonia, 1987.
Hermansky, K. Tsuga, S. Makino and H. Wakita, “Perceptually Based Processing in Automatic Speech Recognition,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Tokyo, Japan, 1986.
Hermansky, B. A. Hanson and H. Wakita, “Perceptually Based Linear Predictive Analysis of Speech,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Tampa, FL, 1985.
Hermansky, B. A. Hanson, H. Wakita and H. Fujisaki, “Linear Predictive Modeling of Speech in Modified Spectral Domains,” Proceedings of International Conference on Digital Processing of Signals in Communications, Institution of Electronic and Radio Engineers, Loughborough, England, 1985.
Hermansky, H. Fujisaki and Y. Sato, “Spectral Envelope Sampling and Interpolation in Linear Predictive Analysis of Speech,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, San Diego, CA, 1984.
Hermansky, H. Fujisaki and Y. Sato, “Analysis and Synthesis of Speech Based on Spectral Transform Linear Predictive Method,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Boston, MA, 1983.
Hermansky, H. Fujisaki and Y. Sato, “Spectral Transform Linear Predictive Analysis for Analysis-Synthesis of Speech,” Proceedings of 11e Congress International d’Acoustique, Paris, France, 1983.
Hermansky, H. Fujisaki and Y. Sato, “Speech Analysis-Synthesis System Based on STLP Method,” Proceedings of the Spring Meeting of the Acoustical Society of Japan, Tokyo, 1983.
Hermansky, H. Fujisaki and Y. Sato, “On Analysis of Real Speech Based on STLP Method,” Proceedings of the Spring Meeting of the Acoustical Society of Japan, Nagaoka, Japan, 1982.
Hermansky, H. Fujisaki and Y. Sato, “Spectral Envelope Sampling in Linear Predictive Analysis of Speech,” Proceedings of the Spring Meeting of the Acoustical Society of Japan, Tokyo, Japan, 1982.
Hermansky and H. Fujisaki, “Spectral Transforms in Linear Prediction,” Proceedings of the Annual Meeting of the Institute of Electronics and Communication Engineers of Japan, Tokyo, Japan, 1982.
Hermansky, “Improved Linear Predictive Analysis of Speech Based on Spectral Processing,” Dr. Eng. Thesis, University of Tokyo, Tokyo, Japan, 12.
Hermansky, H. Fujisaki and Y. Sato, “Sampling and Interpolation of Spectral Envelopes in Linear Predictive Analysis of Speech,” Transactions of the Committee on Speech Research, Acoustical Society of Japan, 1982, pp. 97-104.
Hermansky and H. Fujisaki, “Spectral Transforms in Linear Predictive Analysis,” Proceedings of the Fall Meeting of the Acoustical Society of Japan, Kagoshima, Japan, 1981.
Hermansky and H. Fujisaki, “LPC Methods in the Short Time Analysis of Voiced Speech,” Proceedings of the Spring Meeting of the Acoustical Society of Japan, Tokyo, 1981.
Hermansky and H. Fujisaki, “The Effect of Spectral Transforms in Linear Predictive Analysis of Speech,” Transactions of the Committee on Speech Research, Acoustical Society of Japan, 1981, pp. 365-372.
Hermansky and H. Fujisaki, “Acoustic Characteristics of Czech Vowels, Research on Human Information Processing,” in Proceedings of the Fall Meeting of the Acoustical Society of Japan, Shimizu, Japan, 1980.
