WS97 Acoustic Processing Group
In the 1997 JHU/CLSP workshop (WS97), our group revisits the acoustic processor
architecture employed in state-of-the-art large-vocabulary continuous speech
recognition systems. We investigate data-driven processing paradigms, exploring
techniques at different context scales. At short time scales (~10 ms) we investigate
the non-linear frequency mapping known as the Mel scale. At medium time scales
(context ~100 ms) we investigate linear discriminant and heteroscedastic discriminant
transforms. At time scales with longer context (~1000 ms) we explore feature-trajectory
filtering. At even longer time scales (~500 ms to 4 s) we experiment with adaptive
cepstrum bias normalization techniques.
The results of our investigation are very encouraging and are summarized in the online
final reports and papers.
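For readers unfamiliar with two of the terms above, the following is a minimal sketch, not the workshop's method. It assumes one widely used closed-form Mel mapping (the workshop instead studied learning the mapping from data) and uses plain per-utterance cepstral mean subtraction as the simplest stand-in for bias normalization (the adaptive scheme is described in the reports). The function names `hz_to_mel`, `mel_to_hz`, and `subtract_cepstral_mean` are hypothetical and chosen only for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Map frequency in Hz to mels using one common closed-form variant.

    Assumption: the 2595 * log10(1 + f / 700) formula, not a WS97 result.
    """
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel under the same assumed formula."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def subtract_cepstral_mean(frames):
    """Remove a constant cepstral bias from a sequence of feature frames.

    frames: list of equal-length lists, one cepstral vector per frame.
    Returns new frames with the per-coefficient mean over the whole
    sequence subtracted (plain mean subtraction, not an adaptive scheme).
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    mean = [sum(frame[k] for frame in frames) / n_frames for k in range(dim)]
    return [[frame[k] - mean[k] for k in range(dim)] for frame in frames]
```

Under the assumed formula, 1000 Hz maps to roughly 1000 mels, and after mean subtraction each coefficient averages to zero across the sequence.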
Group Members
Reports and Papers
- Final report and summary of results on SWITCHBOARD, A.G. Andreou ( pdf ).
- Learning the Mel-scale and optimal VTL mapping, T. Kamm, H. Hermansky and A.G.
Andreou ( pdf ).
- Cepstrum bias adaptation for the SWITCHBOARD database in unsupervised mode, Y.
Minami ( pdf ).
- Processing of modulation spectrum of speech for ASR of conversational speech,
H. Hermansky, -available from author-.
- WS97 activity report, C. Wellekens ( pdf ).
- Enhanced ASR scores by acoustic feature filtering, C. Wellekens and H.
Hermansky -DRAFT paper- ( postscript ).
- On generalization of linear discriminant analysis, JHU/ECE Technical Report
96-07, April 1996, K. Nagendra and A.G. Andreou ( pdf ).
- Heteroscedastic discriminant analysis and reduced rank HMM's for improved speech
recognition, K. Nagendra and A.G. Andreou, Speech Communication, Vol. 26,
pp. 283-297, December 1998 ( pdf ).
Presentations
- May 1997 LVCSR meeting, A.G. Andreou ( pdf ).
- WS97 final day presentation, A.G. Andreou ( pdf ).
- WS97 final day presentation, Y. Minami ( pdf ).
Please send feedback to Andreas G. Andreou. This page was last modified on
11/28/00 07:45 PM.