DIMITRA VERGYRI

RESEARCH STATEMENT



I am currently a doctoral candidate in Electrical and Computer Engineering at The Johns Hopkins University and expect to complete my Ph.D. in May 2000. For the past five years I have been working with the Center for Language and Speech Processing. My thesis work, ``Integration of Multiple Knowledge Sources in Speech Recognition using Minimum Error Training'', has been supervised by Prof. Frederick Jelinek. My research interests span acoustic and language modeling, speech recognition, statistical modeling, information theory and statistics.

My thesis research addresses the problem of optimally combining the available models using discriminative objective functions.

In the standard formulation of the speech recognition problem, two models are used to score the sentence hypotheses: the acoustic model and the language model. These are developed independently, and combined using a static parameter for scaling the scores of one of the models relative to the other.
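
As a minimal sketch of this standard combination, assuming each hypothesis carries an acoustic log-probability and a language model log-probability, the static scaling can be written as follows. The function names, the `lm_scale` parameter and the toy scores are illustrative assumptions, not part of any particular recognizer.

```python
def combined_score(acoustic_logprob, lm_logprob, lm_scale):
    """Scale the language model log-score relative to the acoustic log-score."""
    return acoustic_logprob + lm_scale * lm_logprob


def best_hypothesis(hypotheses, lm_scale):
    """Return the text of the hypothesis with the highest combined score.

    `hypotheses` is a list of (text, acoustic_logprob, lm_logprob) tuples.
    """
    best = max(hypotheses, key=lambda h: combined_score(h[1], h[2], lm_scale))
    return best[0]


# Toy N-best list with made-up scores (hypothetical values).
nbest = [
    ("recognize speech", -120.0, -8.0),
    ("wreck a nice beach", -118.0, -14.0),
]
```

With these toy numbers the choice of hypothesis depends on the scaling parameter alone, which is why its value matters in practice.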

In my thesis a general formulation is presented for combining several model scores in a log-linear model that computes the hypothesis likelihood. The model combination can be performed either in a static way, with constant parameters, or in a dynamic way, where the parameters may vary across different segments of a hypothesis. The aim is to optimize the parameters so as to achieve minimum word error rate. In the dynamic case, in order to achieve robust parameter estimation, the parameters are defined to be piecewise constant over classes that form a partition of the space of hypothesis segments.
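
The dynamic combination can be sketched as follows, under the assumption that each hypothesis segment has already been assigned to one class of the partition and that every model supplies a log-score per segment. The class labels, model names and weights below are hypothetical, and the normalization term of the log-linear model is omitted because it does not change the ranking of hypotheses for a given utterance.

```python
def hypothesis_score(segments, weights):
    """Unnormalized log-linear score of one hypothesis.

    `segments` is a list of (segment_class, {model_name: log_score}) pairs.
    `weights` maps segment_class -> {model_name: weight}, so the weights are
    piecewise constant: every segment in the same class shares them.
    """
    total = 0.0
    for segment_class, scores in segments:
        class_weights = weights[segment_class]
        total += sum(class_weights[name] * score for name, score in scores.items())
    return total


# Hypothetical two-class partition of the segments of one hypothesis.
segments = [
    ("content", {"acoustic": -10.0, "lm": -2.0}),
    ("function", {"acoustic": -5.0, "lm": -1.0}),
]
weights = {
    "content": {"acoustic": 1.0, "lm": 2.0},
    "function": {"acoustic": 1.0, "lm": 4.0},
}
score = hypothesis_score(segments, weights)
```

The static combination is the special case in which all segments fall into a single class, so one weight per model applies to the whole hypothesis.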

The approach is used in three different applications:

Different objective functions, aiming to discriminate between the best available hypothesis and the confusable ones, were employed for training the parameters of the model.

The task of training the model with an objective function matched to the goal of a minimum number of errors remains very much among my research interests, irrespective of whether such a function is used from the beginning to train the parameters of the models, or is used in a Discriminative Model Combination framework to combine a set of models trained using Maximum Likelihood techniques.

Within the same framework, I am still interested in pursuing the idea of training (or dynamically modifying) the language model in order to disambiguate among acoustically confusable words. The problem was addressed in the application examined in my thesis, as well as in similar approaches in the literature, but it is still far from being solved.

I would also be interested in working on other applications in the general area of statistical pattern recognition. Primarily, I am looking for a position that will allow me to do research in this area at an industrial research laboratory. I am also interested in the idea of joining a product development group that employs statistical methods for solving ``real world'' problems.
 




Dimitra Vergyri

Wed Apr 26 21:42:05 EDT 2000