Publications



2014

Using Comparable Corpora to Adapt MT Models to New Domains
Ann Irvine and Chris Callison-Burch
Proceedings of the ACL Workshop on Statistical Machine Translation (WMT) – 2014

[abstract] [bib]

Abstract

In previous work we showed that when using an SMT model trained on old-domain data to translate text in a new domain, most errors are due to unseen source words, unseen target translations, and inaccurate translation model scores (Irvine et al., 2013a). In this work, we target errors due to inaccurate translation model scores using new-domain comparable corpora, which we mine from Wikipedia. We assume that we have access to a large old-domain parallel training corpus but only enough new-domain parallel data to tune model parameters and do evaluation. We use the new-domain comparable corpora to estimate additional feature scores over the phrase pairs in our baseline models. Augmenting models with the new features improves the quality of machine translations in the medical and science domains by up to 1.3 BLEU points over very strong baselines trained on the 150 million word Canadian Hansard dataset.
@InProceedings{irvine-callisonburch-wmt14,
author = {Irvine, Ann and Callison-Burch, Chris},
title = {Using Comparable Corpora to Adapt MT Models to New Domains},
booktitle = {Proceedings of the ACL Workshop on Statistical Machine Translation (WMT)},
url = {http://www.cs.jhu.edu/~anni/papers/irvineCCB_wmt14.pdf},
year = {2014}
}

Hallucinating Phrase Translations for Low Resource MT
Ann Irvine and Chris Callison-Burch
Proceedings of the Conference on Computational Natural Language Learning (CoNLL) – 2014

[abstract] [bib]

Abstract

We demonstrate that "hallucinating" phrasal translations can significantly improve the quality of machine translation in low resource conditions. Our hallucinated phrase tables consist of entries composed from multiple unigram translations drawn from the baseline phrase table and from translations that are induced from monolingual corpora. The hallucinated phrase table is very noisy. Its translations are low precision but high recall. We counter this by introducing 30 new feature functions (including a variety of monolingually-estimated features) and by aggressively pruning the phrase table. Our analysis evaluates the intrinsic quality of our hallucinated phrase pairs as well as their impact in end-to-end Spanish-English and Hindi-English MT.
@InProceedings{irvine-callisonburch-conll14,
author = {Irvine, Ann and Callison-Burch, Chris},
title = {Hallucinating Phrase Translations for Low Resource MT},
booktitle = {Proceedings of the Conference on Computational Natural Language Learning (CoNLL)},
url = {http://www.cs.jhu.edu/~anni/papers/irvineCCB_Hallucinating_CoNLL_14.pdf},
year = {2014}
}

Some Insights from Translating Conversational Telephone Speech
Gaurav Kumar, Matt Post, Daniel Povey and Sanjeev Khudanpur
Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) – 2014

[abstract] [bib]

Abstract

We report insights from translating Spanish conversational telephone speech into English text by cascading an automatic speech recognition (ASR) system with a statistical machine translation (SMT) system. The key new insight is that the informal register of conversational speech is a greater challenge for ASR than for SMT: the BLEU score for translating the reference transcript is 64%, but drops to 32% for translating automatic transcripts, whose word error rate (WER) is 40%. Several strategies are examined to mitigate the impact of ASR errors on the SMT output: (i) providing the ASR lattice, instead of the 1-best output, as input to the SMT system, (ii) training the SMT system on Spanish ASR output paired with English text, instead of Spanish reference transcripts, and (iii) improving the core ASR system. Each leads to consistent and complementary improvements in the SMT output. Compared to translating the 1-best output of an ASR system with 40% WER using an SMT system trained on Spanish reference transcripts, translating the output lattice of a better ASR system with 35% WER using an SMT system trained on ASR output improves BLEU from 32% to 38%.
@InProceedings{kumar-some-2014,
author = {Kumar, Gaurav and Post, Matt and Povey, Daniel and Khudanpur, Sanjeev},
title = {Some Insights from Translating Conversational Telephone Speech},
booktitle = {Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
address = {Florence, Italy},
url = {http://cs.jhu.edu/~gkumar/papers/kumar2014some.pdf},
year = {2014}
}

The American Local News Corpus
Ann Irvine, Joshua Langfus and Chris Callison-Burch
Proceedings of the Language Resources and Evaluation Conference (LREC) – 2014

[abstract] [bib]

Abstract

We present the American Local News Corpus (ALNC), containing over 4 billion words of text from 2,652 online newspapers in the United States. Each article in the corpus is associated with a timestamp, state, and city. All 50 U.S. states and 1,924 cities are represented. We detail our method for taking daily snapshots of thousands of local and national newspapers and present two example corpus analyses. The first explores how different sports are talked about over time and geography. The second compares per capita murder rates with news coverage of murders across the 50 states. The ALNC is about the same size as the Gigaword corpus and is growing continuously. Version 1.0 is available for research use.
@InProceedings{irvine-etal-lrec14,
author = {Irvine, Ann and Joshua Langfus and Callison-Burch, Chris},
title = {The American Local News Corpus},
booktitle = {Proceedings of the Language Resources and Evaluation Conference (LREC)},
url = {http://www.cs.jhu.edu/~anni/papers/alnc_lrec14.pdf},
year = {2014}
}

The Language Demographics of Amazon Mechanical Turk
Ellie Pavlick, Matt Post, Ann Irvine, Dmitry Kachaev and Chris Callison-Burch
Transactions of the Association for Computational Linguistics (TACL) – 2014

[abstract] [bib]

Abstract

We present a large scale study of the languages spoken by bilingual workers on Mechanical Turk (MTurk). We establish a methodology for determining the language skills of anonymous crowd workers that is more robust than simple surveying. We validate workers' self-reported language skill claims by measuring their ability to correctly translate words, and by geolocating workers to see if they reside in countries where the languages are likely to be spoken. Rather than posting a one-off survey, we posted paid tasks consisting of 1,000 assignments to translate a total of 10,000 words in each of 100 languages. Our study ran for several months, and was highly visible on the MTurk crowdsourcing platform, increasing the chances that bilingual workers would complete it. Our study was useful both to create bilingual dictionaries and to act as a census of the bilingual speakers on MTurk. We use this data to recommend languages with the largest speaker populations as good candidates for other researchers who want to develop crowdsourced, multilingual technologies. To further demonstrate the value of creating data via crowdsourcing, we hire workers to create bilingual parallel corpora in six Indian languages, and use them to train statistical machine translation systems.
@article{Pavlick-EtAl-2014,
author = {Ellie Pavlick and Post, Matt and Irvine, Ann and Dmitry Kachaev and Callison-Burch, Chris},
title = {The Language Demographics of Amazon Mechanical Turk},
journal = {Transactions of the Association for Computational Linguistics (TACL)},
publisher = {Association for Computational Linguistics},
url = {http://cs.jhu.edu/~ccb/publications/language-demographics-of-mechanical-turk.pdf},
year = {2014}
}

Improving Deep Neural Network Acoustic Models Using Generalized Maxout Networks
Xiaohui Zhang, Jan Trmal, Daniel Povey and Sanjeev Khudanpur
Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) – 2014

[abstract] [bib]

Abstract

Recently, maxout networks have brought significant improvements to various speech recognition and computer vision tasks. In this paper we introduce two new types of generalized maxout units, which we call p-norm and soft-maxout. We investigate their performance in Large Vocabulary Continuous Speech Recognition (LVCSR) tasks in various languages with 10 hours and 60 hours of data, and find that the p-norm generalization of maxout consistently performs well. Because, in our training setup, we sometimes see instability during training when training unbounded-output nonlinearities such as these, we also present a method to control that instability. This is the "normalization layer", which is a nonlinearity that scales down all dimensions of its input in order to stop the average squared output from exceeding one. The performance of our proposed nonlinearities is compared with maxout, rectified linear units (ReLU), tanh units, and also with a discriminatively trained SGMM/HMM system, and our p-norm units with p equal to 2 are found to perform best.
@inproceedings{Zhan1405:Improving,
author = {Xiaohui Zhang and Jan Trmal and Povey, Daniel and Khudanpur, Sanjeev},
title = {Improving Deep Neural Network Acoustic Models Using Generalized Maxout Networks},
booktitle = {Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
address = {Florence, Italy},
url = {http://www.danielpovey.com/files/2014_icassp_dnn.pdf},
year = {2014}
}

An Algerian Arabic-French Code-Switched Corpus
Ryan Cotterell, Adithya Renduchintala, Naomi P. Saphra and Chris Callison-Burch
LREC-2014 Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools – 2014

[bib]

@inproceedings{cotterell_algerian_????,
author = {Cotterell, Ryan and Renduchintala, Adithya and Saphra, Naomi and Callison-Burch, Chris},
title = {An Algerian Arabic-French Code-Switched Corpus},
booktitle = {LREC-2014 Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools},
url = {http://cis.upenn.edu/~ccb/publications/arabic-french-codeswitching.pdf},
year = {2014}
}

Understanding Objects in Detail with Fine-grained Attributes
A. Vedaldi, S. Mahendran, S. Tsogkas, S. Maji, B. Girshick, J. Kannala, E. Rahtu, I. Kokkinos, M. B. Blaschko, D. Weiss, B. Taskar, K. Simonyan, Naomi P. Saphra and S. Mohamed
Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) – 2014

[bib]

@inproceedings{a_vedaldi_understanding_2014,
author = {A. Vedaldi and S. Mahendran and S. Tsogkas and S. Maji and B. Girshick and J. Kannala and E. Rahtu and I. Kokkinos and M. B. Blaschko and D. Weiss and B. Taskar and K. Simonyan and Saphra, Naomi and S. Mohamed},
title = {Understanding Objects in Detail with Fine-grained Attributes},
booktitle = {Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
year = {2014}
}

Principal Components of Auditory Spectro-Temporal Receptive Fields
Nagraj Mahajan, Nima Mesgarani and Hynek Hermansky
Proc. of INTERSPEECH – 2014

[bib]

@inproceedings{mahajan-mesgarani-hermansky:is2014a,
author = {Mahajan, Nagraj and Nima Mesgarani and Hermansky, Hynek},
title = {Principal Components of Auditory Spectro-Temporal Receptive Fields},
booktitle = {Proc. of INTERSPEECH},
year = {2014}
}

Evaluating speech features with the Minimal-Pair ABX task (II): Resistance to noise
Thomas Schatz, Vijayaditya Peddinti, Yuan Cao, Francis Bach, Hynek Hermansky and Emmanuel Dupoux
Proc. of INTERSPEECH – 2014

[bib]

@inproceedings{schatz-peddinti-cao-bach-hermansky-dupoux:is2014c,
author = {Thomas Schatz and Peddinti, Vijayaditya and Cao, Yuan and Francis Bach and Hermansky, Hynek and Emmanuel Dupoux},
title = {Evaluating speech features with the Minimal-Pair ABX task (II): Resistance to noise},
booktitle = {Proc. of INTERSPEECH},
year = {2014}
}

A long, deep and wide artificial neural net for robust speech recognition in unknown noise
Feipeng Li, Phani Sankar Nidadavolu and Hynek Hermansky
Proc. of INTERSPEECH – 2014

[bib]

@inproceedings{li-nidadavolu-hermansky:is2014b,
author = {Feipeng Li and Nidadavolu, Phani and Hermansky, Hynek},
title = {A long, deep and wide artificial neural net for robust speech recognition in unknown noise},
booktitle = {Proc. of INTERSPEECH},
year = {2014}
}

Robust Feature Extraction Using Modulation Filtering of Autoregressive Models
Sriram Ganapathy, Sri Harish Mallidi and Hynek Hermansky
IEEE Transactions on Audio, Speech, and Language Processing – 2014

[bib]

@article{ganapathy-mallidi-hermansky:ieee,
author = {Ganapathy, Sriram and Mallidi, Sri Harish and Hermansky, Hynek},
title = {Robust Feature Extraction Using Modulation Filtering of Autoregressive Models},
journal = {IEEE Transactions on Audio, Speech, and Language Processing},
year = {2014}
}

Particle Filter Rejuvenation and Latent Dirichlet Allocation
Chandler May, Alex Clemmer and Benjamin Van Durme
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL) – 2014

[bib]

@inproceedings{may2014,
author = {May, Chandler and Alex Clemmer and Van Durme, Benjamin},
title = {Particle Filter Rejuvenation and Latent Dirichlet Allocation},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL)},
year = {2014}
}


2013

Improved speech-to-text translation with the Fisher and Callhome Translated Corpus of Spanish-English Speech
Matt Post, Gaurav Kumar, Adam Lopez, Damianos Karakos, Chris Callison-Burch and Sanjeev Khudanpur
Proceedings of the International Workshop on Spoken Language Translation (IWSLT) – 2013

[abstract] [bib]

Abstract

Research into the translation of the output of automatic speech recognition (ASR) systems is hindered by the dearth of datasets developed for that explicit purpose. For Spanish-English translation, in particular, most parallel data available exists only in vastly different domains and registers. In order to support research on cross-lingual speech applications, we introduce the Fisher and Callhome Spanish-English Speech Translation Corpus, supplementing existing LDC audio and transcripts with (a) ASR 1-best, lattice, and oracle output produced by the Kaldi recognition system and (b) English translations obtained on Amazon's Mechanical Turk. The result is a four-way parallel dataset of Spanish audio, transcriptions, ASR lattices, and English translations of approximately 38 hours of speech, with defined training, development, and held-out test sets. We conduct baseline machine translation experiments using models trained on the provided training data, and validate the dataset by corroborating a number of known results in the field, including the utility of in-domain (informal, conversational) training data, increased performance translating lattices (instead of recognizer 1-best output), and the relationship between word error rate and BLEU score.
@InProceedings{post-improved-2013,
author = {Post, Matt and Kumar, Gaurav and Lopez, Adam and Karakos, Damianos and Callison-Burch, Chris and Khudanpur, Sanjeev},
title = {Improved speech-to-text translation with the Fisher and Callhome Translated Corpus of Spanish-English Speech},
booktitle = {Proceedings of the International Workshop on Spoken Language Translation (IWSLT)},
year = {2013}
}

Speech representation based on spectral dynamics
Hynek Hermansky
Proceedings of the International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications – 2013

[bib]

@inproceedings{hermansky:maveba2013,
author = {Hermansky, Hynek},
title = {Speech representation based on spectral dynamics},
booktitle = {Proceedings of the International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications},
address = {Firenze, Italy},
year = {2013}
}

Monolingual Marginal Matching for Translation Model Adaptation
Ann Irvine, Chris Quirk and Hal Daume III
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) – 2013

[abstract] [bib]

Abstract

When using a machine translation (MT) model trained on OLD-domain parallel data to translate NEW-domain text, one major challenge is the large number of out-of-vocabulary and new-translation-sense words. We present a method to identify new translations of both known and unknown source language words that uses NEW-domain comparable document pairs. Starting with a joint distribution of source-target word pairs derived from the OLD-domain parallel corpus, our method recovers a new joint distribution that matches the marginal distributions of the NEW-domain comparable document pairs, while minimizing the divergence from the OLD-domain distribution. Adding these learned translations to our French-English MT model results in gains of about 2 BLEU points over strong baselines.
@InProceedings{irvineQuirkDaumeEMNLP13,
author = {Irvine, Ann and Chris Quirk and Hal Daume III},
title = {Monolingual Marginal Matching for Translation Model Adaptation},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
url = {http://www.aclweb.org/anthology/D/D13/D13-1109.pdf},
year = {2013}
}

Measuring Machine Translation Errors in New Domains
Ann Irvine, John Morgan, Marine Carpuat, Hal Daume III and Dragos Munteanu
Transactions of the Association for Computational Linguistics (TACL) – 2013

[abstract] [bib]

Abstract

We develop two techniques for analyzing the effect of porting a machine translation system to a new domain. One is a macro-level analysis that measures how domain shift affects corpus-level evaluation; the second is a micro-level analysis for word-level errors. We apply these methods to understand what happens when a Parliament-trained phrase-based machine translation system is applied in four very different domains: news, medical texts, scientific articles and movie subtitles. We present quantitative and qualitative experiments that highlight opportunities for future research in domain adaptation for machine translation.
@article{mtDomainErrors_TACL:2013,
author = {Irvine, Ann and John Morgan and Marine Carpuat and Hal Daume III and Dragos Munteanu},
title = {Measuring Machine Translation Errors in New Domains},
journal = {Transactions of the Association for Computational Linguistics (TACL)},
url = {https://aclweb.org/anthology/Q/Q13/Q13-1035.pdf},
year = {2013}
}

Perceptual Properties of Current Speech Recognition Technology, Invited Paper
Hynek Hermansky, Jordan R. Cohen and Richard M. Stern
Proceedings of the IEEE, vol. 101, no. 9 – 2013

[bib]

@article{hermansky-cohen-stern:ieee2013b:,
author = {Hermansky, Hynek and Jordan R. Cohen and Richard M. Stern},
title = {Perceptual Properties of Current Speech Recognition Technology, Invited Paper},
journal = {Proceedings of the IEEE},
volume = {101},
number = {9},
pages = {1968--1985},
year = {2013}
}

SenseSpotting: Never let your parallel data tie you to an old domain
Marine Carpuat, Hal Daume III, Katharine Henry, Ann Irvine, Jagadeesh Jagarlamudi and Rachel Rudinger
Proceedings of the Association for Computational Linguistics (ACL) – 2013

[abstract] [bib]

Abstract

Words often gain new senses in new domains. Being able to automatically identify, from a corpus of monolingual text, which word tokens are being used in a previously unseen sense has applications to machine translation and other tasks sensitive to lexical semantics. We define a task, SENSESPOTTING, in which we build systems to spot tokens that have new senses in new domain text. Instead of difficult and expensive annotation, we build a gold-standard by leveraging cheaply available parallel corpora, targeting our approach to the problem of domain adaptation for machine translation. Our system is able to achieve F-measures of as much as 80%, when applied to word types it has never seen before. Our approach is based on a large set of novel features that capture varied aspects of how words change when used in new domains.
@InProceedings{sensespotting13,
author = {Marine Carpuat and Hal Daume III and Henry, Katharine and Irvine, Ann and Jagadeesh Jagarlamudi and Rudinger, Rachel},
title = {SenseSpotting: Never let your parallel data tie you to an old domain},
booktitle = {Proceedings of the Association for Computational Linguistics (ACL)},
url = {http://www.aclweb.org/anthology/P/P13/P13-1141.pdf},
year = {2013}
}

Combining Bilingual and Comparable Corpora for Low Resource Machine Translation
Ann Irvine and Chris Callison-Burch
Proceedings of the ACL Workshop on Statistical Machine Translation (WMT) – 2013

[abstract] [bib]

Abstract

Statistical machine translation (SMT) performance suffers when models are trained on only small amounts of parallel data. The learned models typically have both low accuracy (incorrect translations and feature scores) and low coverage (high out-of-vocabulary rates). In this work, we use an additional data resource, comparable corpora, to improve both. Beginning with a small bitext and corresponding phrase-based SMT model, we improve coverage by using bilingual lexicon induction techniques to learn new translations from comparable corpora. Then, we supplement the model’s feature space with translation scores estimated over comparable corpora in order to improve accuracy. We observe improvements between 0.5 and 1.7 BLEU translating Tamil, Telugu, Bengali, Malayalam, Hindi, and Urdu into English.
@inProceedings{irvineCallisonBurchWMT13,
author = {Irvine, Ann and Callison-Burch, Chris},
title = {Combining Bilingual and Comparable Corpora for Low Resource Machine Translation},
booktitle = {Proceedings of the ACL Workshop on Statistical Machine Translation (WMT)},
url = {http://www.cs.jhu.edu/~anni/papers/irvineCCB_WMT13.pdf},
year = {2013}
}

Supervised Bilingual Lexicon Induction with Multiple Monolingual Signals
Ann Irvine and Chris Callison-Burch
Proceedings of the North American Association for Computational Linguistics (NAACL) – 2013

[abstract] [bib]

Abstract

Prior research into learning translations from monolingual texts has treated the task as an unsupervised learning problem. Although many techniques take advantage of a seed bilingual lexicon, this work is the first to use that data for supervised learning to combine a diverse set of monolingual signals into a single discriminative model. Even in a low resource machine translation setting, where induced translations have the potential to improve performance substantially, it is reasonable to assume access to some amount of data to perform this kind of optimization. We report bilingual lexicon induction accuracies that are on average nearly 50% higher than an unsupervised baseline. Large gains in accuracy hold for all 22 languages (low and high resource) that we investigate.
@InProceedings{irvineCallisonBurch13,
author = {Irvine, Ann and Callison-Burch, Chris},
title = {Supervised Bilingual Lexicon Induction with Multiple Monolingual Signals},
booktitle = {Proceedings of the North American Association for Computational Linguistics (NAACL)},
url = {http://www.cs.jhu.edu/~anni/papers/irvineCallisonBuch-NAACL2013.pdf},
year = {2013}
}

Dealing with Unknown Unknowns: Multi-stream Recognition of Speech, Invited Paper
Hynek Hermansky
Proceedings of the IEEE, vol. 101, no. 5 – 2013

[bib]

@article{hermansky:ieee2013a,
author = {Hermansky, Hynek},
title = {Dealing with Unknown Unknowns: Multi-stream Recognition of Speech, Invited Paper},
journal = {Proceedings of the IEEE},
volume = {101},
number = {5},
pages = {1076--1088},
year = {2013}
}

Statistical Machine Translation in Low Resource Settings
Ann Irvine
Proceedings of the NAACL Student Research Workshop – 2013

[bib]

@InProceedings{irvineNAACLSRW13,
author = {Irvine, Ann},
title = {Statistical Machine Translation in Low Resource Settings},
booktitle = {Proceedings of the NAACL Student Research Workshop},
year = {2013}
}

Quantifying the Value of Pronunciation Lexicons for Keyword Search in Low Resource Languages
Guoguo Chen, Sanjeev Khudanpur, Daniel Povey, Jan Trmal, David Yarowsky and Oguz Yilmaz
2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) – 2013

Tags: Speech Recognition, Keyword Search, Information Retrieval, Morphology, Speech Synthesis  |  [bib]

@inproceedings{chen2013quantifying,
author = {Chen, Guoguo and Khudanpur, Sanjeev and Povey, Daniel and Jan Trmal and Yarowsky, David and Yilmaz, Oguz},
title = {Quantifying the Value of Pronunciation Lexicons for Keyword Search in Low Resource Languages},
booktitle = {2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages = {8560--8564},
url = {http://old-site.clsp.jhu.edu/~guoguo/papers/chen2013quantifying.pdf},
year = {2013}
}

The (Un)faithful Machine Translator
Ruth Jones and Ann Irvine
ACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH) – 2013

[bib]

@inProceedings{jonesIrvine,
author = {Ruth Jones and Irvine, Ann},
title = {The (Un)faithful Machine Translator},
booktitle = {ACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)},
url = {http://www.cs.jhu.edu/~anni/papers/jonesIrvineTranslation.pdf},
year = {2013}
}

A framework for (under)specifying dependency syntax without overloading annotators
Nathan Schneider, Brendan O'Connor, Naomi P. Saphra, David Bamman, Manaal Faruqui, Noah Smith, Chris Dyer and Jason Baldridge
Linguistic Annotation Workshop at ACL – 2013

[bib]

@inproceedings{DBLP:journals/corr/SchneiderOSBFSDB13,
author = {Nathan Schneider and Brendan O'Connor and Saphra, Naomi and David Bamman and Manaal Faruqui and Noah Smith and Chris Dyer and Jason Baldridge},
title = {A framework for (under)specifying dependency syntax without overloading annotators},
booktitle = {Linguistic Annotation Workshop at ACL},
year = {2013}
}

Using Proxies for OOV Keywords in the Keyword Search Task
Guoguo Chen, Oguz Yilmaz, Jan Trmal, Daniel Povey and Sanjeev Khudanpur
Proceedings of ASRU 2013 – 2013

Tags: Speech Recognition, Keyword Search, OOV Keywords, Proxy Keywords, Low Resource LVCSR  |  [bib]

@inproceedings{chen2013using,
author = {Chen, Guoguo and Yilmaz, Oguz and Jan Trmal and Povey, Daniel and Khudanpur, Sanjeev},
title = {Using Proxies for OOV Keywords in the Keyword Search Task},
booktitle = {Proceedings of ASRU 2013},
url = {http://old-site.clsp.jhu.edu/~guoguo/papers/chen2013using.pdf},
year = {2013}
}

Fixed-Dimensional Acoustic Embeddings of Variable-Length Segments in Low-Resource Settings
Keith Levin, Katharine Henry, Aren Jansen and Karen Livescu
ASRU – 2013

[bib]

@inproceedings{Levin2013,
author = {Levin, Keith and Henry, Katharine and Jansen, Aren and Karen Livescu},
title = {Fixed-Dimensional Acoustic Embeddings of Variable-Length Segments in Low-Resource Settings},
booktitle = {ASRU},
year = {2013}
}

Long, Deep and Wide Artificial Neural Nets for Dealing with Unexpected Noise in Machine Recognition of Speech
Hynek Hermansky
Proc. Text, Speech and Dialogue, Springer – 2013

[bib]

@inproceedings{hermansky:tsd2013f,
author = {Hermansky, Hynek},
title = {Long, Deep and Wide Artificial Neural Nets for Dealing with Unexpected Noise in Machine Recognition of Speech},
booktitle = {Proc. Text, Speech and Dialogue, Springer},
year = {2013}
}

Improvements in Language Identification on the RATS Noisy Speech Corpus
Hynek Hermansky, Jeff Ma, Bing Zhang, Spyros Matsoukas, Sri Harish Mallidi and Feipeng Li
Proc. INTERSPEECH – 2013

[bib]

@inproceedings{hermansky-ma-zhang-matsoukas-mallidi-li:is2013:,
author = {Hermansky, Hynek and Jeff Ma and Bing Zhang and Spyros Matsoukas and Mallidi, Sri Harish and Feipeng Li},
title = {Improvements in Language Identification on the RATS Noisy Speech Corpus},
booktitle = {Proc. INTERSPEECH},
year = {2013}
}

Robust Speaker Recognition Using Spectro-Temporal Autoregressive Models
Sri Harish Mallidi, Sriram Ganapathy and Hynek Hermansky
Proc. INTERSPEECH – 2013

[bib]

@inproceedings{mallidi-ganapathy-hermansky:is2013,
author = {Mallidi, Sri Harish and Ganapathy, Sriram and Hermansky, Hynek},
title = {Robust Speaker Recognition Using Spectro-Temporal Autoregressive Models},
booktitle = {Proc. INTERSPEECH},
year = {2013}
}

Evaluating speech features with the Minimal-Pair ABX task: Analysis of the classical MFC/PLP pipeline
Thomas Schatz, Vijayaditya Peddinti, Francis Bach, Aren Jansen, Hynek Hermansky and Emmanuel Dupoux
Proc. INTERSPEECH – 2013

[bib]

@inproceedings{schatz-peddinti-bach-jansen-hermansky-dupoux:is2013,
author = {Thomas Schatz and Peddinti, Vijayaditya and Francis Bach and Jansen, Aren and Hermansky, Hynek and Emmanuel Dupoux},
title = {Evaluating speech features with the Minimal-Pair ABX task: Analysis of the classical MFC/PLP pipeline},
booktitle = {Proc. INTERSPEECH},
year = {2013}
}

Multi-stream recognition of noisy speech with performance monitoring
Ehsan Variani, Feipeng Li and Hynek Hermansky
Proc. INTERSPEECH – 2013

[bib]

@inproceedings{variani-li-hermansky:is2013,
author = {Variani, Ehsan and Feipeng Li and Hermansky, Hynek},
title = {Multi-stream recognition of noisy speech with performance monitoring},
booktitle = {Proc. INTERSPEECH},
year = {2013}
}

Text-to-Speech Inspired Duration Modeling for Improved Whole-Word Acoustic Models
Keith Kintzley, Aren Jansen and Hynek Hermansky
Proc. INTERSPEECH – 2013

[bib]

@inproceedings{kintzley-jansen-hermansky:is2013,
author = {Kintzley, Keith and Jansen, Aren and Hermansky, Hynek},
title = {Text-to-Speech Inspired Duration Modeling for Improved Whole-Word Acoustic Models},
booktitle = {Proc. INTERSPEECH},
year = {2013}
}

Weak Top-Down Constraints For Unsupervised Acoustic Model Training
Aren Jansen, Samuel Thomas and Hynek Hermansky
Proc. ICASSP – 2013

[bib]

@inproceedings{jansen-thomas-hermansky:icassp2013,
author = {Jansen, Aren and Thomas, Samuel and Hermansky, Hynek},
title = {Weak Top-Down Constraints For Unsupervised Acoustic Model Training},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada},
year = {2013}
}

Deep Neural Network Features and Semi-Supervised Training for Low Resource Speech Recognition
Samuel Thomas, Michael Seltzer, Kenneth Church and Hynek Hermansky
Proc. ICASSP – 2013

[bib]

@inproceedings{thomas-seltzer-church-hermansky:icassp2013,
author = {Thomas, Samuel and Michael Seltzer and Church, Kenneth and Hermansky, Hynek},
title = {Deep Neural Network Features and Semi-Supervised Training for Low Resource Speech Recognition},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada},
year = {2013}
}

Effect Of Filter Bandwidth and Spectral Sampling Rate of Analysis Filterbank on Automatic Phoneme Recognition
Feipeng Li and Hynek Hermansky
Proc. ICASSP – 2013

[bib]

@inproceedings{li-hermansky:icassp2013,
author = {Feipeng Li and Hermansky, Hynek},
title = {Effect Of Filter Bandwidth and Spectral Sampling Rate of Analysis Filterbank on Automatic Phoneme Recognition},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada},
year = {2013}
}

Frequency Offset Correction in Speech Without Detecting Pitch
Pascal Clark, Sri Harish Mallidi, Aren Jansen and Hynek Hermansky
Proc. ICASSP – 2013

[bib]

@inproceedings{clark-mallidi-jansen-hermansky:icassp2013,
author = {Pascal Clark and Mallidi, Sri Harish and Jansen, Aren and Hermansky, Hynek},
title = {Frequency Offset Correction in Speech Without Detecting Pitch},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada},
year = {2013}
}

A Summary Of The 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition
Aren Jansen, Emmanuel Dupoux, Sharon Goldwater, Mark Johnson, Sanjeev Khudanpur, Kenneth Church, Naomi Feldman, Hynek Hermansky, Florian Metze, Richard Rose, Michael Seltzer, Pascal Clark, Ian McGraw, Balakrishnan Varadarajan, Erin Bennett, Benjamin Borschinger, Justin Chiu, Ewan Dunbar, Abdellah Fourtassi, David Harwath, Chia-Ying Lee, Keith Levin, Atta Norouzain, Vijayaditya Peddinti, Rachael Richardson, Thomas Schatz and Samuel Thomas
Proc. ICASSP – 2013

[bib]

@inproceedings{jansen-dupoux-goldwater-johnson-khudanpur-church-feldman-hermansky-metze-rose-seltzer-clark-mcgraw-varadarajan-bennett-borschinger-chiu-dunbar-fourtassi-harwath-lee-levin-norouzain-peddinti-richardson-schatz-thomas:icassp2013,
author = {Jansen, Aren and Emmanuel Dupoux and Sharon Goldwater and Mark Johnson and Khudanpur, Sanjeev and Church, Kenneth and Naomi Feldman and Hermansky, Hynek and Florian Metze and Richard Rose and Michael Seltzer and Pascal Clark and Ian McGraw and Varadarajan, Balakrishnan and Erin Bennett and Benjamin Borschinger and Justin Chiu and Ewan Dunbar and Abdellah Fourtassi and David Harwath and Chia-Ying Lee and Levin, Keith and Atta Norouzain and Peddinti, Vijayaditya and Rachael Richardson and Thomas Schatz and Thomas, Samuel},
title = {A Summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada}
}

Mean Temporal Distance: Predicting ASR Error from Temporal Properties of Speech Signal
Hynek Hermansky, Ehsan Variani and Vijayaditya Peddinti
Proc. ICASSP – 2013

[bib]

@inproceedings{hermansky-variani-peddinti:icassp2013,
author = {Hermansky, Hynek and Variani, Ehsan and Peddinti, Vijayaditya},
title = {Mean Temporal Distance: Predicting ASR Error from Temporal Properties of Speech Signal},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada}
}

Filter-Bank Optimization for Frequency Domain Linear Prediction
Vijayaditya Peddinti and Hynek Hermansky
Proc. ICASSP – 2013

[bib]

@inproceedings{peddinti-hermansky:icassp2013,
author = {Peddinti, Vijayaditya and Hermansky, Hynek},
title = {Filter-Bank Optimization for Frequency Domain Linear Prediction},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada}
}

Developing a Speaker Identification System for the DARPA RATS Project
Oldrich Plchot, Spyros Matsoukas, Pavel Matejka, Najim Dehak, Jeff Ma, Sandro Cumani, Ondrej Glembek, Hynek Hermansky, Sri Harish Mallidi, Nima Mesgarani, Richard Schwartz, Mehdi Soufifar, Zheng-Hua Tan, Samuel Thomas, Bing Zhang and Xinhui Zhou
Proc. ICASSP – 2013

[bib]

@inproceedings{plchot-matsoukas-matejka-dehak-ma-cumani-glembek-hermansky-mallidi-mesgarani-schwartz-soufifar-tan-thomas-zhang-zhou:icassp2013,
author = {Plchot, Oldrich and Matsoukas, Spyros and Matejka, Pavel and Dehak, Najim and Ma, Jeff and Cumani, Sandro and Glembek, Ondrej and Hermansky, Hynek and Mallidi, Sri Harish and Mesgarani, Nima and Schwartz, Richard and Soufifar, Mehdi and Tan, Zheng-Hua and Thomas, Samuel and Zhang, Bing and Zhou, Xinhui},
title = {Developing a Speaker Identification System for the DARPA RATS Project},
booktitle = {Proc. ICASSP},
address = {Vancouver, Canada}
}


2012

Regularized Auto-Associative Neural Networks for Speaker Verification
Sri Garimella, Sri Harish Mallidi and Hynek Hermansky
IEEE Signal Processing Letters – 2012

[bib]

@article{garimella-mallidi-hermansky:ieee2012,
author = {Garimella, Sri and Mallidi, Sri Harish and Hermansky, Hynek},
title = {Regularized Auto-Associative Neural Networks for Speaker Verification},
journal = {IEEE Signal Processing Letters},
pages = {841--844}
}

Beyond Novelty Detection: Incongruent Events, when General and Specific Classifiers Disagree
D. D. Weinshall, A. A. Zweig, Hynek Hermansky, S. Kombrink, F. W. Ohl, J. Anemueller, J. H. Bach, L. van Gool, F. Nater, T. Pajdla, M. Havlena and M. Pavel
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 10 – 2012

[bib]

@article{weinshall-zweig-hermansky-kombrink-ohl-anemueller-bach-gool-nater-pajdla-havlena-pavel:ieee2012,
author = {Weinshall, D. D. and Zweig, A. A. and Hermansky, Hynek and Kombrink, S. and Ohl, F. W. and Anemueller, J. and Bach, J. H. and van Gool, L. and Nater, F. and Pajdla, T. and Havlena, M. and Pavel, M.},
title = {Beyond Novelty Detection: Incongruent Events, when General and Specific Classifiers Disagree},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume = {34},
number = {10},
pages = {1886--1901}
}

A Flexible Solver for Finite Arithmetic Circuits
Nathaniel Filardo and Jason Eisner
Technical Communications of the 28th International Conference on Logic Programming, ICLP 2012 – 2012

[abstract] [bib]

Abstract

Arithmetic circuits arise in the context of weighted logic programming languages, such as Datalog with aggregation, or Dyna. A weighted logic program defines a generalized arithmetic circuit—the weighted version of a proof forest, with nodes having arbitrary rather than boolean values. In this paper, we focus on finite circuits. We present a flexible algorithm for efficiently querying node values as they change under updates to the circuit's inputs. Unlike traditional algorithms, ours is agnostic about which nodes are tabled (materialized), and can vary smoothly between the traditional strategies of forward and backward chaining. Our algorithm is designed to admit future generalizations, including cyclic and infinite circuits and propagation of delta updates.
@inproceedings{filardo-eisner-2012-iclp,
author = {Filardo, Nathaniel and Eisner, Jason},
title = {A Flexible Solver for Finite Arithmetic Circuits},
booktitle = {Technical Communications of the 28th International Conference on Logic Programming, ICLP 2012},
url = {http://cs.jhu.edu/~jason/papers/#iclp12}
}

MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors
Keith Kintzley, Aren Jansen and Hynek Hermansky
Proc. of INTERSPEECH – 2012

[abstract] [bib]

Abstract

The intrinsic advantages of whole-word acoustic modeling are offset by the problem of data sparsity. To address this, we present several parametric approaches to estimating intra-word phonetic timing models under the assumption that relative timing is independent of word duration. We show evidence that the timing of phonetic events is well described by the Gaussian distribution. We explore the construction of models in the absence of keyword examples (dictionary-based), when keyword examples are abundant (Gaussian mixture models), and also present a Bayesian approach which unifies the two. Applying these techniques in a point process model keyword spotting framework, we demonstrate a 55% relative improvement in performance for models constructed from few examples.
@InProceedings{kintzley-jansen-hermansky:is2012a,
author = {Kintzley, Keith and Jansen, Aren and Hermansky, Hynek},
title = {MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors},
booktitle = {Proc. of INTERSPEECH},
url = {http://old-site.clsp.jhu.edu/~ajansen/papers/IS2012c.pdf}
}

Inverting the Point Process Model for Fast Phonetic Keyword Search
Keith Kintzley, Aren Jansen, Kenneth Church and Hynek Hermansky
Proc. of INTERSPEECH – 2012

[abstract] [bib]

Abstract

Normally, we represent speech as a long sequence of frames and model the keyword with a relatively small set of parameters, commonly with a hidden Markov model (HMM). However, since the input speech is much longer than the keyword, suppose instead that we represent the speech as a relatively sparse set of impulses (roughly one per phoneme) and model the keyword as a filter-bank where each filter's impulse response relates to the likelihood of a phone at a given position within a word. Evaluating keyword detections can then be seen as a convolution of an impulse train with an array of filters. This view enables huge speedups; runtime no longer depends on the frame rate and is instead linear in the number of events (impulses). We apply this intuition to redesign the runtime engine behind the point process model for keyword spotting. We demonstrate impressive real-time speedups (500,000x faster than real-time) with minimal loss in search accuracy.
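The filter-bank view sketched in this abstract amounts to convolving sparse per-phone impulse trains with per-phone filters and summing the results. A toy illustration of that scoring rule (function names and data layout are illustrative, not the authors' implementation):

```python
import numpy as np

def keyword_scores(events, filters, num_steps):
    """Score keyword detections by convolving sparse trains of
    phonetic-event impulses with per-phone filters.

    events:  dict phone -> list of (time_index, strength) impulses
    filters: dict phone -> 1-D impulse response (likelihood of that
             phone at each offset within the keyword)
    """
    scores = np.zeros(num_steps)
    for phone, impulses in events.items():
        h = filters.get(phone)
        if h is None:
            continue
        # Build the sparse impulse train for this phone.
        train = np.zeros(num_steps)
        for t, strength in impulses:
            train[t] += strength
        # A sparse implementation would touch only the impulses, making
        # runtime linear in the number of events rather than the frame
        # rate; np.convolve is used here purely for clarity.
        scores += np.convolve(train, h)[:num_steps]
    return scores
```

With roughly one impulse per phoneme instead of hundreds of frames per second, the event list is tiny, which is the source of the large speedups the paper reports.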
@InProceedings{kintzley-jansen-church-hermansky:is2012b,
author = {Kintzley, Keith and Jansen, Aren and Church, Kenneth and Hermansky, Hynek},
title = {Inverting the Point Process Model for Fast Phonetic Keyword Search},
booktitle = {Proc. of INTERSPEECH},
address = {Portland, Oregon, USA},
publisher = {International Speech Communication Association},
url = {http://old-site.clsp.jhu.edu/~ajansen/papers/IS2012d.pdf}
}

Temporal Resolution Analysis in Frequency Domain Linear Prediction
Sriram Ganapathy and Hynek Hermansky
Express Letters section of the Journal of the Acoustical Society of America – 2012

[bib]

@article{ganapathy-hermansky:jasa2012,
author = {Ganapathy, Sriram and Hermansky, Hynek},
title = {Temporal Resolution Analysis in Frequency Domain Linear Prediction},
journal = {Journal of the Acoustical Society of America (Express Letters)}
}

Findings of the 2012 Workshop on Statistical Machine Translation
Chris Callison-Burch, Philipp Koehn, Christof Monz, Matt Post, Radu Soricut and Lucia Specia
Proceedings of the Seventh Workshop on Statistical Machine Translation – 2012

[abstract] [bib]

Abstract

This paper presents the results of the WMT12 shared tasks, which included a translation task, a task for machine translation evaluation metrics, and a task for run-time estimation of machine translation quality. We conducted a large-scale manual evaluation of 103 machine translation systems submitted by 34 teams. We used the ranking of these systems to measure how strongly automatic metrics correlate with human judgments of translation quality for 12 evaluation metrics. We introduced a new quality estimation task this year, and evaluated submissions from 11 teams.
@inproceedings{callisonburch-EtAl:2012:WMT,
author = {Callison-Burch, Chris and Koehn, Philipp and Monz, Christof and Post, Matt and Soricut, Radu and Specia, Lucia},
title = {Findings of the 2012 Workshop on Statistical Machine Translation},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation},
address = {Montr\'eal, Canada},
publisher = {Association for Computational Linguistics},
pages = {10--51},
url = {http://cs.jhu.edu/~ccb/publications/findings-of-the-wmt12-shared-tasks.pdf}
}

Constructing Parallel Corpora for Six Indian Languages via Crowdsourcing
Matt Post, Chris Callison-Burch and Miles Osborne
Proceedings of the Seventh Workshop on Statistical Machine Translation – 2012

[abstract] [bib]

Abstract

Recent work has established the efficacy of Amazon's Mechanical Turk for constructing parallel corpora for machine translation research. We apply this to building a collection of parallel corpora between English and six languages from the Indian subcontinent: Bengali, Hindi, Malayalam, Tamil, Telugu, and Urdu. These languages are low-resource, under-studied, and exhibit linguistic phenomena that are difficult for machine translation. We conduct a variety of baseline experiments and analysis, and release the data to the community.
@inproceedings{post-callisonburch-osborne:2012:WMT,
author = {Post, Matt and Callison-Burch, Chris and Osborne, Miles},
title = {Constructing Parallel Corpora for Six Indian Languages via Crowdsourcing},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation},
address = {Montr\'eal, Canada},
publisher = {Association for Computational Linguistics},
pages = {401--409},
url = {http://www.aclweb.org/anthology/W12-3152}
}

Using Categorial Grammar to Label Translation Rules
Jonathan Weese, Chris Callison-Burch and Adam Lopez
Proceedings of the Seventh Workshop on Statistical Machine Translation – 2012

[abstract] [bib]

Abstract

Adding syntactic labels to synchronous context-free translation rules can improve performance, but labeling with phrase structure constituents, as in GHKM (Galley et al., 2004), excludes potentially useful translation rules. SAMT (Zollmann and Venugopal, 2006) introduces heuristics to create new non-constituent labels, but these heuristics introduce many complex labels and tend to add rarely-applicable rules to the translation grammar. We introduce a labeling scheme based on categorial grammar, which allows syntactic labeling of many rules with a minimal, well-motivated label set. We show that our labeling scheme performs comparably to SAMT on an Urdu–English translation task, yet the label set is an order of magnitude smaller, and translation is twice as fast.
@inproceedings{weese-callisonburch-lopez:2012:WMT,
author = {Weese, Jonathan and Callison-Burch, Chris and Lopez, Adam},
title = {Using Categorial Grammar to Label Translation Rules},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation},
address = {Montr\'eal, Canada},
publisher = {Association for Computational Linguistics},
pages = {222--231},
url = {http://cs.jhu.edu/~ccb/publications/using-categorial-grammar-to-label-translation-rules.pdf}
}

Joshua 4.0: Packing, PRO, and Paraphrases
Juri Ganitkevitch, Yuan Cao, Jonathan Weese, Matt Post and Chris Callison-Burch
Proceedings of the Seventh Workshop on Statistical Machine Translation – 2012

[abstract] [bib]

Abstract

We present Joshua 4.0, the newest version of our open-source decoder for parsing-based statistical machine translation. The main contributions in this release are the introduction of a compact grammar representation based on packed tries, and the integration of our implementation of pairwise ranking optimization, J-PRO. We further present the extension of the Thrax SCFG grammar extractor to pivot-based extraction of syntactically informed sentential paraphrases.
@inproceedings{ganitkevitch-EtAl:2012:WMT,
author = {Ganitkevitch, Juri and Cao, Yuan and Weese, Jonathan and Post, Matt and Callison-Burch, Chris},
title = {Joshua 4.0: Packing, PRO, and Paraphrases},
booktitle = {Proceedings of the Seventh Workshop on Statistical Machine Translation},
address = {Montr\'eal, Canada},
publisher = {Association for Computational Linguistics},
pages = {283--291},
url = {http://cs.jhu.edu/~ccb/publications/joshua-4.0.pdf}
}

Monolingual Distributional Similarity for Text-to-Text Generation
Juri Ganitkevitch, Benjamin Van Durme and Chris Callison-Burch
*SEM First Joint Conference on Lexical and Computational Semantics – 2012

[abstract] [bib]

Abstract

Previous work on paraphrase extraction and application has relied on either parallel datasets, or on distributional similarity metrics over large text corpora. Our approach combines these two orthogonal sources of information and directly integrates them into our paraphrasing system’s log-linear model. We compare different distributional similarity feature-sets and show significant improvements in grammaticality and meaning retention on the example text-to-text generation task of sentence compression, achieving state-of-the-art quality.
@inproceedings{Ganitkevitch-etal:2012:StarSEM,
author = {Ganitkevitch, Juri and Van Durme, Benjamin and Callison-Burch, Chris},
title = {Monolingual Distributional Similarity for Text-to-Text Generation},
booktitle = {*SEM First Joint Conference on Lexical and Computational Semantics},
address = {Montreal},
publisher = {Association for Computational Linguistics},
url = {http://cs.jhu.edu/~ccb/publications/monolingual-distributional-similarity-for-text-to-text-generation.pdf}
}

Machine Translation of Arabic Dialects
Rabih Zbib, Erika Malchiodi, Jacob Devlin, David Stallard, Spyros Matsoukas, Richard Schwartz, John Makhoul, Omar Zaidan and Chris Callison-Burch
The 2012 Conference of the North American Chapter of the Association for Computational Linguistics – 2012

[abstract] [bib]

Abstract

Arabic dialects present many challenges for machine translation, not least of which is the lack of data resources. We use crowdsourcing to cheaply and quickly build Levantine-English and Egyptian-English parallel corpora, consisting of 1.1M words and 380k words, respectively. The dialect sentences are selected from a large corpus of Arabic web text, and translated using Mechanical Turk. We use this data to build Dialect Arabic MT systems. Small amounts of dialect data have a dramatic impact on the quality of translation. When translating Egyptian and Levantine test sets, our Dialect Arabic MT system performs 5.8 and 6.8 BLEU points higher than a Modern Standard Arabic MT system trained on a 150 million word Arabic-English parallel corpus -- over 100 times the amount of data as our dialect corpora.
@inproceedings{Zbib-etal:2012:NAACL,
author = {Zbib, Rabih and Malchiodi, Erika and Devlin, Jacob and Stallard, David and Matsoukas, Spyros and Schwartz, Richard and Makhoul, John and Zaidan, Omar and Callison-Burch, Chris},
title = {Machine Translation of Arabic Dialects},
booktitle = {The 2012 Conference of the North American Chapter of the Association for Computational Linguistics},
address = {Montreal},
publisher = {Association for Computational Linguistics},
url = {http://cs.jhu.edu/~ccb/publications/machine-translation-of-arabic-dialects.pdf}
}

Training and Evaluating a Statistical Part of Speech Tagger for Natural Language Applications using Kepler Workflows
Doug Briesch, Reginald Hobbs, Claire Jaja, Brian Kjersten and Clare Voss
Procedia Computer Science – 2012

[abstract] [bib]

Abstract

A core technology of natural language processing (NLP) incorporated into many text processing applications is a part of speech (POS) tagger, a software component that labels words in text with syntactic tags such as noun, verb, adjective, etc. These tags may then be used within more complex tasks such as parsing, question answering, and machine translation (MT). In this paper we describe the phases of our work training and evaluating statistical POS taggers on Arabic texts and their English translations using Kepler workflows. While the original objectives for encapsulating our research code within Kepler workflows were driven by software engineering needs to document and verify the reusability of our software, our research benefitted as well: the ease of rapid retraining and testing enabled our researchers to detect reporting discrepancies, document their source, and independently validate the correct results.
@article{Briesch20121588,
author = {Briesch, Doug and Hobbs, Reginald and Jaja, Claire and Kjersten, Brian and Voss, Clare},
title = {Training and Evaluating a Statistical Part of Speech Tagger for Natural Language Applications using Kepler Workflows},
journal = {Procedia Computer Science},
pages = {1588--1594},
url = {http://www.sciencedirect.com/science/article/pii/S1877050912002955}
}

Annotated Gigaword
Courtney Napoles, Matt Gormley and Benjamin Van Durme
AKBC-WEKEX Workshop at NAACL 2012 – 2012

[bib]

@inproceedings{napoles-EtAl:2012:Agiga,
author = {Napoles, Courtney and Gormley, Matt and Van Durme, Benjamin},
title = {Annotated Gigaword},
booktitle = {AKBC-WEKEX Workshop at NAACL 2012}
}

Cost-Sensitive Dynamic Feature Selection
He He, Hal Daumé III and Jason Eisner
ICML Workshop on Inferning: Interactions between Inference and Learning – 2012

[abstract] [bib]

Abstract

We present an instance-specific test-time dynamic feature selection algorithm. Our algorithm sequentially chooses features given previously selected features and their values. It stops the selection process to make a prediction according to a user-specified accuracy-cost trade-off. We cast the sequential decision-making problem as a Markov Decision Process and apply imitation learning techniques. We address the problem of learning and inference jointly in a simple multiclass classification setting. Experimental results on UCI datasets show that our approach achieves the same or higher accuracy using only a small fraction of features than static feature selection methods.
@inproceedings{he-et-al-2012-icmlw,
author = {He, He and Daumé III, Hal and Eisner, Jason},
title = {Cost-Sensitive Dynamic Feature Selection},
booktitle = {ICML Workshop on Inferning: Interactions between Inference and Learning},
url = {http://cs.jhu.edu/~jason/papers/#icmlw12-dynfeat}
}

Fast and Accurate Prediction via Evidence-Specific MRF Structure
Veselin Stoyanov and Jason Eisner
ICML Workshop on Inferning: Interactions between Inference and Learning – 2012

[abstract] [bib]

Abstract

We are interested in speeding up approximate inference in Markov Random Fields (MRFs). We present a new method that uses gates—binary random variables that determine which factors of the MRF to use. Which gates are open depends on the observed evidence; when many gates are closed, the MRF takes on a sparser and faster structure that omits "unnecessary" factors. We train parameters that control the gates, jointly with the ordinary MRF parameters, in order to locally minimize an objective that combines loss and runtime.
@inproceedings{stoyanov-eisner-2012-icmlw,
author = {Stoyanov, Veselin and Eisner, Jason},
title = {Fast and Accurate Prediction via Evidence-Specific MRF Structure},
booktitle = {ICML Workshop on Inferning: Interactions between Inference and Learning},
url = {http://cs.jhu.edu/~jason/papers/#icmlw12-gates}
}

Implicitly Intersecting Weighted Automata using Dual Decomposition
Michael Paul and Jason Eisner
Proceedings of NAACL-HLT – 2012

[abstract] [bib]

Abstract

We propose an algorithm to find the best path through an intersection of arbitrarily many weighted automata, without actually performing the intersection. The algorithm is based on dual decomposition: the automata attempt to agree on a string by communicating about features of the string. We demonstrate the algorithm on the Steiner consensus string problem, both on synthetic data and on consensus decoding for speech recognition. This involves implicitly intersecting up to 100 automata.
@inproceedings{paul-eisner-2012-naacl,
author = {Paul, Michael and Eisner, Jason},
title = {Implicitly Intersecting Weighted Automata using Dual Decomposition},
booktitle = {Proceedings of NAACL-HLT},
pages = {232--242},
url = {http://cs.jhu.edu/~jason/papers/#naacl12-dd}
}

Unsupervised Learning on an Approximate Corpus
Jason Smith and Jason Eisner
Proceedings of NAACL-HLT – 2012

[abstract] [bib]

Abstract

Unsupervised learning techniques can take advantage of large amounts of unannotated text, but the largest text corpus (the Web) is not easy to use in its full form. Instead, we have statistics about this corpus in the form of n-gram counts (Brants and Franz, 2006). While n-gram counts do not directly provide sentences, a distribution over sentences can be estimated from them in the same way that n-gram language models are estimated. We treat this distribution over sentences as an approximate corpus and show how unsupervised learning can be performed on such a corpus using variational inference. We compare hidden Markov model (HMM) training on exact and approximate corpora of various sizes, measuring speed and accuracy on unsupervised part-of-speech tagging.
@inproceedings{smith-eisner-2012,
author = {Smith, Jason and Eisner, Jason},
title = {Unsupervised Learning on an Approximate Corpus},
booktitle = {Proceedings of NAACL-HLT},
pages = {131--141},
url = {http://cs.jhu.edu/~jason/papers/#naacl12-ngram}
}

Minimum-Risk Training of Approximate CRF-Based NLP Systems
Veselin Stoyanov and Jason Eisner
Proceedings of NAACL-HLT – 2012

[abstract] [bib]

Abstract

Conditional Random Fields (CRFs) are a popular formalism for structured prediction in NLP. It is well known how to train CRFs with certain topologies that admit exact inference, such as linear-chain CRFs. Some NLP phenomena, however, suggest CRFs with more complex topologies. Should such models be used, considering that they make exact inference intractable? Stoyanov et al. (2011) recently argued for training parameters to minimize the task-specific loss of whatever approximate inference and decoding methods will be used at test time. We apply their method to three NLP problems, showing that (i) using more complex CRFs leads to improved performance, and that (ii) minimum-risk training learns more accurate models.
@inproceedings{stoyanov-eisner-2012-naacl,
author = {Stoyanov, Veselin and Eisner, Jason},
title = {Minimum-Risk Training of Approximate CRF-Based NLP Systems},
booktitle = {Proceedings of NAACL-HLT},
pages = {120--130},
url = {http://cs.jhu.edu/~jason/papers/#naacl12-risk}
}

Learned Prioritization for Trading Off Accuracy and Speed
Jiarong Jiang, Adam Teichert, Hal Daumé III and Jason Eisner
ICML Workshop on Inferning: Interactions between Inference and Learning – 2012

[abstract] [bib]

Abstract

Users want natural language processing (NLP) systems to be both fast and accurate, but quality often comes at the cost of speed. The field has been manually exploring various speed-accuracy tradeoffs for particular problems or datasets. We aim to explore this space automatically, focusing here on the case of agenda-based syntactic parsing (Kay, 1986). Unfortunately, off-the-shelf reinforcement learning techniques fail to learn good policies: the state space is too large to explore naively. We propose a hybrid reinforcement/apprenticeship learning algorithm that, even with few inexpensive features, can automatically learn weights that achieve competitive accuracies at significant improvements in speed over state-of-the-art baselines.
@inproceedings{jiang-et-al-2012-icmlw,
author = {Jiang, Jiarong and Teichert, Adam and Daumé III, Hal and Eisner, Jason},
title = {Learned Prioritization for Trading Off Accuracy and Speed},
booktitle = {ICML Workshop on Inferning: Interactions between Inference and Learning},
url = {http://cs.jhu.edu/~jason/papers/#icmlw12-ldp}
}

Shared Components Topic Models
Matt Gormley, Mark Dredze, Benjamin Van Durme and Jason Eisner
Proceedings of NAACL-HLT – 2012

[abstract] [bib]

Abstract

With a few exceptions, extensions to latent Dirichlet allocation (LDA) have focused on the distribution over topics for each document. Much less attention has been given to the underlying structure of the topics themselves. As a result, most topic models generate topics independently from a single underlying distribution and require millions of parameters, in the form of multinomial distributions over the vocabulary. In this paper, we introduce the Shared Components Topic Model (SCTM), in which each topic is a normalized product of a smaller number of underlying component distributions. Our model learns these component distributions and the structure of how to combine subsets of them into topics. The SCTM can represent topics in a much more compact representation than LDA and achieves better perplexity with fewer parameters.
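The composition rule described here — a topic formed as the normalized product of a subset of shared component distributions — fits in a few lines. A toy sketch under names of my own choosing, not the authors' inference code:

```python
import numpy as np

def sctm_topic(components, subset):
    """Form one SCTM-style topic: the normalized elementwise product
    of a chosen subset of component distributions.

    components: (C, V) array; each row is a distribution over V words
    subset:     indices of the components combined into this topic
    """
    prod = np.prod(np.asarray(components)[list(subset)], axis=0)
    return prod / prod.sum()
```

Because components are shared across topics, many topics can be represented with far fewer multinomials than one-distribution-per-topic models like LDA, which is the compactness the abstract claims.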
@inproceedings{gormley-et-al-2012-naacl,
author = {Gormley, Matt and Dredze, Mark and Van Durme, Benjamin and Eisner, Jason},
title = {Shared Components Topic Models},
booktitle = {Proceedings of NAACL-HLT},
pages = {783--792},
url = {http://cs.jhu.edu/~jason/papers/#naacl12-sctm}
}

Space Efficiencies in Discourse Modeling via Conditional Random Sampling
Brian Kjersten and Benjamin Van Durme
2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies – 2012

[abstract] [bib]

Abstract

Recent exploratory efforts in discourse-level language modeling have relied heavily on calculating Pointwise Mutual Information (PMI), which involves significant computation when done over large collections. Prior work has required aggressive pruning or independence assumptions to compute scores on large collections. We show the method of Conditional Random Sampling, thus far an underutilized technique, to be a space-efficient means of representing the sufficient statistics in discourse that underlie recent PMI-based work. This is demonstrated in the context of inducing Schankian script-like structures over news articles.
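For reference, the PMI statistic at the center of this line of work can be computed directly from exact co-occurrence counts; Conditional Random Sampling replaces these counts with compact sketch-based estimates. A naive exact-count baseline (my own sketch, not the paper's system):

```python
import math
from collections import Counter

def pmi_scores(pairs):
    """PMI(x, y) = log [ p(x, y) / (p(x) p(y)) ], with all
    probabilities estimated from a list of observed co-occurring
    (x, y) pairs."""
    joint = Counter(pairs)                    # c(x, y)
    left = Counter(x for x, _ in pairs)       # c(x)
    right = Counter(y for _, y in pairs)      # c(y)
    n = len(pairs)
    return {
        (x, y): math.log((c / n) / ((left[x] / n) * (right[y] / n)))
        for (x, y), c in joint.items()
    }
```

The three Counter tables are exactly the sufficient statistics whose memory footprint the paper's sampling approach reduces.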
@inproceedings{KjerstenVanDurme2012,
author = {Kjersten, Brian and Van Durme, Benjamin},
title = {Space Efficiencies in Discourse Modeling via Conditional Random Sampling},
booktitle = {2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
address = {Montreal, Canada},
publisher = {Association for Computational Linguistics},
pages = {513--517},
url = {http://www.aclweb.org/anthology/N/N12/N12-1056.pdf}
}

Stylometric Analysis of Scientific Articles
Shane Bergsma, Matt Post and David Yarowsky
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies – 2012

Tags: stylometry, syntax  |  [abstract] [bib]

Abstract

We present an approach to automatically recover hidden attributes of scientific articles, such as whether the author is a native English speaker, whether the author is a male or a female, and whether the paper was published in a conference or workshop proceedings. We train classifiers to predict these attributes in computational linguistics papers. The classifiers perform well in this challenging domain, identifying non-native writing with 95% accuracy (over a baseline of 67%). We show the benefits of using syntactic features in stylometry; syntax leads to significant improvements over bag-of-words models on all three tasks, achieving 10% to 25% relative error reduction. We give a detailed analysis of which words and syntax most predict a particular attribute, and we show a strong correlation between our predictions and a paper’s number of citations.
@inproceedings{bergsma-post-yarowsky:2012:NAACL-HLT,
author = {Bergsma, Shane and Post, Matt and Yarowsky, David},
title = {Stylometric Analysis of Scientific Articles},
booktitle = {Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
address = {Montr\'eal, Canada},
publisher = {Association for Computational Linguistics},
pages = {327--337},
url = {http://www.aclweb.org/anthology/N12-1033}
}

Judging Grammaticality with Count-Induced Tree Substitution Grammars
Francis Ferraro, Matt Post and Benjamin Van Durme
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP – 2012

[abstract] [bib]

Abstract

Prior work has shown the utility of syntactic tree fragments as features in judging the grammaticality of text. To date such fragments have been extracted from derivations of Bayesian-induced Tree Substitution Grammars (TSGs). Evaluating on discriminative coarse and fine grammaticality classification tasks, we show that a simple, deterministic, count-based approach to fragment identification performs on par with the more complicated grammars of Post (2011). This represents a significant reduction in complexity for those interested in the use of such fragments in the development of systems for the educational domain.
@inproceedings{ferraro-post-vandurme:2012:BEA,
author = {Ferraro, Francis and Post, Matt and Van Durme, Benjamin},
title = {Judging Grammaticality with Count-Induced Tree Substitution Grammars},
booktitle = {Proceedings of the Seventh Workshop on Building Educational Applications Using NLP},
address = {Montr\'eal, Canada},
publisher = {Association for Computational Linguistics},
pages = {116--121},
url = {http://www.aclweb.org/anthology/W12-2013}
}

Toward Tree Substitution Grammars with Latent Annotations
Francis Ferraro, Benjamin Van Durme and Matt Post
Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure – 2012

[abstract] [bib]

Abstract

We provide a model that extends the split-merge framework of Petrov et al. (2006) to jointly learn latent annotations and Tree Substitution Grammars (TSGs). We then conduct a variety of experiments with this model, first inducing grammars on a portion of the Penn Treebank and the Korean Treebank 2.0, and next experimenting with grammar refinement from a single nonterminal and from the Universal Part of Speech tagset. We present qualitative analysis showing promising signs across all experiments that our combined approach successfully provides for greater flexibility in grammar induction within the structured guidance provided by the treebank, leveraging the complementary natures of these two approaches.
@inproceedings{ferraro-vandurme-post:2012:WILS,
author = {Ferraro, Francis and Van Durme, Benjamin and Post, Matt},
title = {Toward Tree Substitution Grammars with Latent Annotations},
booktitle = {Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure},
address = {Montr\'eal, Canada},
publisher = {Association for Computational Linguistics},
pages = {23--30},
url = {http://www.aclweb.org/anthology/W12-1904}
}

Toward Statistical Machine Translation without Parallel Corpora
Alex Klementiev, Ann Irvine, Chris Callison-Burch and David Yarowsky
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL) – 2012

[abstract] [bib]

Abstract

We estimate the parameters of a phrase-based statistical machine translation system from monolingual corpora instead of a bilingual parallel corpus. We extend existing research on bilingual lexicon induction to estimate both lexical and phrasal translation probabilities for MT-scale phrase-tables. We propose a novel algorithm to estimate re-ordering probabilities from monolingual data. We report translation results for an end-to-end translation system using these monolingual features alone. Our method only requires monolingual corpora in source and target languages, a small bilingual dictionary, and a small bitext for tuning feature weights. In this paper, we examine an idealization where a phrase-table is given. We examine the degradation in translation performance when bilingually estimated translation probabilities are removed, and show that 82%+ of the loss can be recovered with monolingually estimated features alone. We further show that our monolingual features add 1.5 BLEU points when combined with standard bilingually estimated phrase table features.
@InProceedings{klementiev-etal:2012:EACL,
author = {Alex Klementiev and Irvine, Ann and Callison-Burch, Chris and Yarowsky, David},
title = {Toward Statistical Machine Translation without Parallel Corpora},
booktitle = {Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL)},
address = {Avignon, France},
publisher = {Association for Computational Linguistics},
url = {http://cs.jhu.edu/~ccb/publications/toward-statistical-machine-translation-without-parallel-corpora.pdf}
}

Learning Multivariate Distributions by Competitive Assembly of Marginals
Francisco Sanchez-Vega, Jason Eisner, Laurent Younes and Donald Geman
IEEE Transactions on Pattern Analysis and Machine Intelligence – 2012

[abstract] [bib]

Abstract

We present a new framework for learning high-dimensional multivariate probability distributions from estimated marginals. The approach is motivated by compositional models and Bayesian networks, and designed to adapt to small sample sizes. We start with a large, overlapping set of elementary statistical building blocks, or "primitives," which are low-dimensional marginal distributions learned from data. Each variable may appear in many primitives. Subsets of primitives are combined in a lego-like fashion to construct a probabilistic graphical model; only a small fraction of the primitives will participate in any valid construction. Since primitives can be precomputed, parameter estimation and structure search are separated. Model complexity is controlled by strong biases; we adapt the primitives to the amount of training data and impose rules which restrict the merging of them into allowable compositions. The likelihood of the data decomposes into a sum of local gains, one for each primitive in the final structure. We focus on a specific subclass of networks which are binary forests. Structure optimization corresponds to an integer linear program and the maximizing composition can be computed for reasonably large numbers of variables. Performance is evaluated using both synthetic data and real datasets from natural language processing and computational biology.
@article{sanchezvega-et-al-2012,
author = {Francisco Sanchez-Vega and Eisner, Jason and Laurent Younes and Donald Geman},
title = {Learning Multivariate Distributions by Competitive Assembly of Marginals},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2012},
url = {http://cs.jhu.edu/~jason/papers/#pami12}
}

Confidence-Weighted Linear Classification for Text Categorization
Koby Crammer, Mark Dredze and Fernando Pereira
2012

[abstract] [bib]

Abstract

Confidence-weighted online learning is a generalization of margin-based learning of linear classifiers in which the margin constraint is replaced by a probabilistic constraint based on a distribution over classifier weights that is updated online as examples are observed. The distribution captures a notion of confidence on classifier weights, and in some cases it can also be interpreted as replacing a single learning rate by adaptive per-weight rates. Confidence-weighted learning was motivated by the statistical properties of natural language classification tasks, where most of the informative features are relatively rare. We investigate several versions of confidence-weighted learning that use a Gaussian distribution over weight vectors, updated at each observed example to achieve high probability of correct classification for the example. Empirical evaluation on a range of text-categorization tasks shows that our algorithms improve over other state-of-the-art online and batch methods, learn faster in the online setting, and lead to better classifier combination for a type of distributed training commonly used in cloud computing.
@article{Pereira:2011fk,
author = {Koby Crammer and Dredze, Mark and Fernando Pereira},
title = {Confidence-Weighted Linear Classification for Text Categorization}
}
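The Gaussian-over-weights idea in the abstract above can be illustrated with a closely related online update. The sketch below implements a diagonal AROW-style update (Crammer, Kulesza, and Dredze, 2009), a simpler relative of confidence-weighted learning that likewise maintains a mean and a per-weight variance over the classifier weights; it is an illustrative stand-in, not the paper's exact CW update rule, and the class and parameter names are hypothetical.

```python
# Diagonal AROW-style online update: a simplified relative of
# confidence-weighted learning (illustrative sketch, not the paper's
# exact CW rule). Keeps a Gaussian over weights as mean + variance.

class DiagonalAROW:
    def __init__(self, dim, r=1.0):
        self.mu = [0.0] * dim      # mean weight vector
        self.sigma = [1.0] * dim   # per-weight variance (diagonal covariance)
        self.r = r                 # regularization constant

    def predict(self, x):
        return sum(m * xi for m, xi in zip(self.mu, x))

    def update(self, x, y):
        """One online update for example x with label y in {-1, +1}."""
        margin = y * self.predict(x)
        if margin >= 1.0:
            return  # confidently correct: leave the distribution alone
        v = sum(s * xi * xi for s, xi in zip(self.sigma, x))  # x^T Sigma x
        beta = 1.0 / (v + self.r)
        alpha = (1.0 - margin) * beta
        for i, xi in enumerate(x):
            self.mu[i] += alpha * y * self.sigma[i] * xi
            # shrink variance on features we have now seen evidence for
            self.sigma[i] -= beta * self.sigma[i] ** 2 * xi * xi

clf = DiagonalAROW(dim=2)
clf.update([1.0, 0.0], +1)
clf.update([0.0, 1.0], -1)
```

Rare features retain high variance and therefore receive large updates when they finally appear, which is the property the abstract highlights for natural language data.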

New H-Infinity Bounds for the Recursive Least Squares Algorithm Exploiting Input Structure
Koby Crammer, Alex Kulesza and Mark Dredze
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) – 2012

[bib]

@inproceedings{Crammer:2012fk,
author = {Koby Crammer and Alex Kulesza and Dredze, Mark},
title = {New H-Infinity Bounds for the Recursive Least Squares Algorithm Exploiting Input Structure},
booktitle = {International Conference on Acoustics, Speech, and Signal Processing (ICASSP)}
}

Use of Modality and Negation in Semantically-Informed Syntactic MT
Kathryn Baker, Bonnie Dorr, Michael Bloodgood, Chris Callison-Burch, Nathaniel Filardo, Christine Piatko, Lori Levin and Scott Miller
Computational Linguistics – 2012

[abstract] [bib]

Abstract

This article describes the resource- and system-building efforts of an eight-week JHU Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically-Informed Machine Translation (SIMT). We describe a new modality/negation (MN) annotation scheme, a (publicly available) MN lexicon, and two automated MN taggers that we built using the annotation scheme and lexicon. Our annotation scheme isolates three components of modality and negation: a trigger (a word that conveys modality or negation), a target (an action associated with modality or negation) and a holder (an experiencer of modality). We describe how our MN lexicon was produced semi-automatically and we demonstrate that a structure-based MN tagger results in precision around 86% (depending on genre) for tagging of a standard LDC data set. We apply our MN annotation scheme to statistical machine translation using a syntactic framework that supports the inclusion of semantic annotations. Syntactic tags enriched with semantic annotations are assigned to parse trees in the target-language training texts through a process of tree grafting. While the focus of our work is modality and negation, the tree grafting procedure is general and supports other types of semantic information. We exploit this capability by including named entities, produced by a pre-existing tagger, in addition to the MN elements produced by the taggers described in this paper. The resulting system significantly outperformed a linguistically naïve baseline model (Hiero), and reached the highest scores yet reported on the NIST 2009 Urdu-English test set. This finding supports the hypothesis that both syntactic and semantic information can improve translation quality.
@article{baker-etal:2012:CL,
author = {Kathryn Baker and Bonnie Dorr and Michael Bloodgood and Callison-Burch, Chris and Filardo, Nathaniel and Christine Piatko and Lori Levin and Scott Miller},
title = {Use of Modality and Negation in Semantically-Informed Syntactic MT},
journal = {Computational Linguistics},
year = {2012},
url = {http://cs.jhu.edu/~ccb/publications/modality-and-negation-in-semantically-informed-syntactic-mt.pdf}
}

Processing Informal, Romanized Pakistani Text Messages
Ann Irvine, Jonathan Weese and Chris Callison-Burch
Proceedings of the NAACL Workshop on Language in Social Media – 2012

[abstract] [bib]

Abstract

Regardless of language, the standard character set for text messages (SMS) and many other social media platforms is the Roman alphabet. There are romanization conventions for some character sets, but they are used inconsistently in informal text, such as SMS. In this work, we convert informal, romanized Urdu messages into the native Arabic script and normalize non-standard SMS language. Doing so prepares the messages for existing downstream processing tools, such as machine translation, which are typically trained on well-formed, native script text. Our model combines information at the word and character levels, allowing it to handle out-of-vocabulary items. Compared with a baseline deterministic approach, our system reduces both word and character error rate by over 50%.
@inproceedings{IrvineWeeseCallisonburchSMS12,
author = {Irvine, Ann and Weese, Jonathan and Callison-Burch, Chris},
title = {Processing Informal, Romanized Pakistani Text Messages},
booktitle = {Proceedings of the NAACL Workshop on Language in Social Media},
address = {Montreal, Canada},
publisher = {Association for Computational Linguistics},
url = {http://www.cs.jhu.edu/~anni/papers/urduSMS/urduSMS.pdf}
}

Digitizing 18th-Century French Literature: Comparing transcription methods for a critical edition text
Ann Irvine, Laure Marcellesi and Afra Zomorodian
Proceedings of the NAACL Workshop on Computational Linguistics for Literature – 2012

[abstract] [bib]

Abstract

We compare four methods for transcribing early printed texts. Our comparison is through a case-study of digitizing an eighteenth-century French novel for a new critical edition: the 1784 Lettres taïtiennes by Joséphine de Monbart. We provide a detailed error analysis of transcription by optical character recognition (OCR), non-expert humans, and expert humans and weigh each technique based on accuracy, speed, cost and the need for scholarly overhead. Our findings are relevant to 18th-century French scholars as well as the entire community of scholars working to preserve, present, and revitalize interest in literature published before the digital age.
@inproceedings{IrvineMarcellesiZomorodianFrench12,
author = {Irvine, Ann and Laure Marcellesi and Afra Zomorodian},
title = {Digitizing 18th-Century French Literature: Comparing transcription methods for a critical edition text},
booktitle = {Proceedings of the NAACL Workshop on Computational Linguistics for Literature},
address = {Montreal, Canada},
publisher = {Association for Computational Linguistics},
url = {http://www.cs.jhu.edu/~anni/papers/IrvineMonbartNAACL.pdf}
}

Expectations of Word Sense in Parallel Corpora
Xuchen Yao, Benjamin Van Durme and Chris Callison-Burch
NAACL – 2012

[bib]

@inproceedings{Yao2012NAACL,
author = {Yao, Xuchen and Van Durme, Benjamin and Callison-Burch, Chris},
title = {Expectations of Word Sense in Parallel Corpora},
booktitle = {NAACL},
url = {http://cs.jhu.edu/~xuchen/paper/Yao2012NAACL.pdf}
}

Semantics-based Question Generation and Implementation
Xuchen Yao, Gosse Bouma and Zhaonian Zhang
Dialogue and Discourse, Special Issue on Question Generation – 2012

[bib]

@article{Yao2012DDqg,
author = {Yao, Xuchen and Gosse Bouma and Zhang, Zhaonian},
title = {Semantics-based Question Generation and Implementation},
journal = {Dialogue and Discourse, Special Issue on Question Generation},
year = {2012},
pages = {11--42},
url = {http://cs.jhu.edu/~xuchen/paper/Yao2012DDqg.pdf}
}

Sample Selection for Large-scale MT Discriminative Training
Yuan Cao and Sanjeev Khudanpur
Proceedings of the Annual Conference of the Association for Machine Translation in the Americas (AMTA) – 2012

[bib]

@inproceedings{sampleselection,
author = {Yuan Cao and Khudanpur, Sanjeev},
title = {Sample Selection for Large-scale MT Discriminative Training},
booktitle = {Proceedings of the Annual Conference of the Association for Machine Translation in the Americas (AMTA)},
address = {San Diego, US}
}

Automatic Measurement of Positive and Negative Voice Onset Time
Katharine Henry, Morgan Sonderegger and Joseph Keshet
Interspeech – 2012

[bib]

@inproceedings{VOT2012,
author = {Henry, Katharine and Morgan Sonderegger and Joseph Keshet},
title = {Automatic Measurement of Positive and Negative Voice Onset Time},
booktitle = {Interspeech}
}

Generating Exact Lattices in The WFST Framework
Daniel Povey, Mirko Hannemann, Gilles Boulianne, Lukáš Burget, Arnab Ghoshal, Miloš Janda, Martin Karafiát, Stefan Kombrink, Petr Motlíček, Yanmin Qian, Korbinian Riedhammer, Karel Veselý and Thang Vu
Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing – 2012

[bib]

@inproceedings{poveyexactlattice,
author = {Povey, Daniel and Mirko Hannemann and Gilles Boulianne and Lukáš Burget and Ghoshal, Arnab and Miloš Janda and Martin Karafiát and Stefan Kombrink and Petr Motlíček and Yanmin Qian and Korbinian Riedhammer and Karel Veselý and Thang Vu},
title = {Generating Exact Lattices in The WFST Framework},
booktitle = {Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing},
publisher = {IEEE Signal Processing Society},
pages = {4213--4216}
}

Estimating Classifier Performance in Unknown Noise
Ehsan Variani and Hynek Hermansky
Proc. Interspeech – 2012

[bib]

@inproceedings{variani-hermansky:is2012,
author = {Variani, Ehsan and Hermansky, Hynek},
title = {Estimating Classifier Performance in Unknown Noise},
booktitle = {Proc. Interspeech},
address = {Portland, Oregon}
}

Data-driven Posterior Features for Low Resource Speech Recognition Applications
Samuel Thomas, Sriram Ganapathy, Aren Jansen and Hynek Hermansky
Proc. of INTERSPEECH – 2012

[bib]

@inproceedings{thomas-ganapthy-jansen-hermansky:is2012,
author = {Thomas, Samuel and Ganapathy, Sriram and Jansen, Aren and Hermansky, Hynek},
title = {Data-driven Posterior Features for Low Resource Speech Recognition Applications},
booktitle = {Proc. of INTERSPEECH},
address = {Portland, Oregon}
}

Phone recognition in critical bands using sub-band temporal modulations
Feipeng Li, Sri Harish Mallidi and Hynek Hermansky
Proc. Interspeech – 2012

[bib]

@inproceedings{li-mallidi-hermansky:is2012,
author = {Feipeng Li and Mallidi, Sri Harish and Hermansky, Hynek},
title = {Phone recognition in critical bands using sub-band temporal modulations},
booktitle = {Proc. Interspeech},
address = {Portland, Oregon}
}

Analysis of Temporal Resolution in Frequency Domain Linear Prediction
Sriram Ganapathy and Hynek Hermansky
Proc. of INTERSPEECH – 2012

[bib]

@inproceedings{ganapathy-hermansky:is2012,
author = {Ganapathy, Sriram and Hermansky, Hynek},
title = {Analysis of Temporal Resolution in Frequency Domain Linear Prediction},
booktitle = {Proc. of INTERSPEECH},
address = {Portland, Oregon}
}

Multilingual MLP Features For Low-Resource LVCSR Systems
Samuel Thomas, Sriram Ganapathy and Hynek Hermansky
Proc. ICASSP – 2012

[bib]

@inproceedings{thomas-ganapathy-hermansky:icassp2012,
author = {Thomas, Samuel and Ganapathy, Sriram and Hermansky, Hynek},
title = {Multilingual MLP Features For Low-Resource LVCSR Systems},
booktitle = {Proc. ICASSP},
address = {Kyoto, Japan}
}

The UMD-JHU 2011 Speaker Recognition System
Daniel Garcia-Romero, Xinhui Zhou, Dmitry Zotkin, Balaji Srinivasan, Xiaoqiang Luo, Sriram Ganapathy, Samuel Thomas, Sridhar Krishna Nemala, Sri Garimella, Majid Mirbagheri, Sri Harish Mallidi, Thomas Janu, Padmanabhan Rajan, Nima Mesgarani, Mounya Elhilali, Hynek Hermansky, Shihab Shamma and Ramani Duraiswami
Proc. ICASSP – 2012

[bib]

@inproceedings{romero-zhou-zotkin-srinivasan-luo-ganapathy-thomas-nemala-garimella-mirbagheri-mallidi-janu-rajan-mesgarani-elhilali-hermansky-shamma-duraiswami:icassp2012,
author = {Garcia-Romero, Daniel and Xinhui Zhou and Dmitry Zotkin and Balaji Srinivasan and Luo, Xiaoqiang and Ganapathy, Sriram and Thomas, Samuel and Nemala, Sridhar Krishna and Garimella, Sri and Majid Mirbagheri and Mallidi, Sri Harish and Thomas Janu and Padmanabhan Rajan and Nima Mesgarani and Elhilali, Mounya and Hermansky, Hynek and Shihab Shamma and Ramani Duraiswami},
title = {The UMD-JHU 2011 Speaker Recognition System},
booktitle = {Proc. ICASSP},
address = {Kyoto, Japan}
}

Acoustic and Data-driven Features for Robust Speech Activity Detection
Samuel Thomas, Sri Harish Mallidi, Thomas Janu, Hynek Hermansky, Nima Mesgarani, Xinhui Zhou, Shihab Shamma, Tim Ng, Bing Zhang, Long Nguyen and Spyros Matsoukas
Proc. of INTERSPEECH – 2012

[bib]

@inproceedings{thomas-mallidi-janu-hermansky-mesgarani-zhou-shamma-ng-zhang-nguyen-matsoukas:is2012,
author = {Thomas, Samuel and Mallidi, Sri Harish and Thomas Janu and Hermansky, Hynek and Nima Mesgarani and Xinhui Zhou and Shihab Shamma and Tim Ng and Bing Zhang and Long Nguyen and Spyros Matsoukas},
title = {Acoustic and Data-driven Features for Robust Speech Activity Detection},
booktitle = {Proc. of INTERSPEECH},
address = {Portland, Oregon}
}

DIRAC: Detection and identification of rare audio-visual events
J. Anemuller, B. Caputo, Hynek Hermansky, F. Ohl, T. Pajdla, M. Pavel, L. Van Gool, R. Vogels, S. Wabnik and D. Weinshall
Studies in Computational Intelligence 384: 3-35, Anemuller, Weinshall, Van Gool, Eds. Springer – 2012

[bib]

@article{anemuller-caputo-hermansky-ohl-pajda-pavel-gool-vogels-wabnik-weinshall:sci2012,
author = {J. Anemuller and B. Caputo and Hermansky, Hynek and F. Ohl and T. Pajdla and M. Pavel and L. Van Gool and R. Vogels and S. Wabnik and D. Weinshall},
title = {DIRAC: Detection and identification of rare audio-visual events},
journal = {Studies in Computational Intelligence},
volume = {384},
pages = {3--35},
year = {2012}
}


2011

Generating More Specific Questions
Xuchen Yao
AAAI Symposium on Question Generation – 2011

[bib]

@inproceedings{Yao2011QG,
author = {Yao, Xuchen},
title = {Generating More Specific Questions},
booktitle = {AAAI Symposium on Question Generation},
address = {Arlington, VA},
url = {http://cs.jhu.edu/~xuchen/paper/Yao2011QG.pdf}
}

NADA: A Robust System for Non-Referential Pronoun Detection
Shane Bergsma and David Yarowsky
Proc. DAARC – 2011

[bib]

@inproceedings{Bergsma:11,
author = {Bergsma, Shane and Yarowsky, David},
title = {NADA: A Robust System for Non-Referential Pronoun Detection},
booktitle = {Proc. DAARC},
address = {Faro, Portugal}
}

Speech recognition from spectral dynamics, Invited Paper
Hynek Hermansky
SADHANA, Indian Academy of Sciences, Vol. 36, Part 5 – 2011

[bib]

@article{hermansky:ias2011,
author = {Hermansky, Hynek},
title = {Speech recognition from spectral dynamics, Invited Paper},
journal = {Sadhana, Indian Academy of Sciences},
volume = {36},
year = {2011},
pages = {729--744}
}

Arabic Optical Character Recognition (OCR) Evaluation in Order to Develop a Post-OCR Module
Brian Kjersten
2011

[abstract] [details] [bib]

Abstract

Optical character recognition (OCR) is the process of converting an image of a document into text. While progress in OCR research has enabled low error rates for English text in low-noise images, performance is still poor for noisy images and documents in other languages. We intend to create a post-OCR processing module for noisy Arabic documents which can correct OCR errors before passing the resulting Arabic text to a translation system. To this end, we are evaluating an Arabic-script OCR engine on documents with the same content but varying levels of image quality. We have found that OCR text accuracy can be improved with different stages of pre-OCR image processing: (1) filtering out low-contrast images to avoid hallucination of characters, (2) removing marks from images with cleanup software to prevent their misrecognition, and (3) zoning multi-column images with segmentation software to enable recognition of all zones. The specific errors observed in OCR will form the basis of training data for our post-OCR correction module.

Additional Information

Memorandum Report number ARL-MR-0798

@techreport{kjersten2011,
author = {Kjersten, Brian},
title = {Arabic Optical Character Recognition (OCR) Evaluation in Order to Develop a Post-OCR Module},
institution = {Army Research Laboratory},
number = {ARL-MR-0798},
year = {2011},
pages = {1--18},
url = {http://www.stormingmedia.us/56/5644/A564455.html}
}

Using Visual Information to Predict Lexical Preference
Shane Bergsma and Randy Goebel
Proc. RANLP – 2011

[bib]

@inproceedings{Bergsma-Goebel:11,
author = {Bergsma, Shane and Randy Goebel},
title = {Using Visual Information to Predict Lexical Preference},
booktitle = {Proc. RANLP},
address = {Hissar, Bulgaria},
pages = {399--405}
}

Event Selection from Phone Posteriorgrams Using Matched Filters
Keith Kintzley, Aren Jansen and Hynek Hermansky
Proc. of INTERSPEECH – 2011

[abstract] [bib]

Abstract

In this paper we address the issue of how to select a minimal set of phonetic events from a phone posteriorgram while minimizing the loss of information. We derive phone posteriorgrams from two sources, Gaussian mixture models and sparse multilayer perceptrons, and apply phone-specific matched filters to the posteriorgrams to yield a smaller set of phonetic events. We introduce a mutual information based performance measure to compare phonetic event selection techniques and demonstrate that events extracted using matched filters can reduce input data while significantly improving performance of an event-based keyword spotting system.
@inproceedings{kintzley-jansen-hermansky:is2011,
author = {Kintzley, Keith and Jansen, Aren and Hermansky, Hynek},
title = {Event Selection from Phone Posteriorgrams Using Matched Filters},
booktitle = {Proc. of INTERSPEECH},
address = {Florence, Italy},
pages = {1905--1908},
url = {http://old-site.clsp.jhu.edu/~ajansen/papers/IS2011c.pdf}
}
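As a rough illustration of the event-selection idea described in the abstract above, the sketch below correlates a single phone's posterior trajectory with a smoothing filter and keeps thresholded local maxima as sparse events. The filter shape, threshold, and data are illustrative assumptions, not the paper's learned matched filters.

```python
# Illustrative sketch: matched-filter smoothing of one posteriorgram
# row, then peak picking to turn a dense trajectory into sparse events.
# Filter, threshold, and data are toy assumptions, not the paper's.

def matched_filter(posteriors, filt):
    """Correlate a posterior trajectory with a phone-specific filter
    (same-length output, zero-padded at the edges)."""
    half = len(filt) // 2
    out = []
    for t in range(len(posteriors)):
        s = 0.0
        for k, w in enumerate(filt):
            i = t + k - half
            if 0 <= i < len(posteriors):
                s += w * posteriors[i]
        out.append(s)
    return out

def select_events(posteriors, filt, threshold=0.5):
    """Keep local maxima of the filtered trajectory above a threshold,
    reducing the dense row to a sparse set of (time, score) events."""
    smoothed = matched_filter(posteriors, filt)
    events = []
    for t in range(1, len(smoothed) - 1):
        if (smoothed[t] >= threshold
                and smoothed[t] >= smoothed[t - 1]
                and smoothed[t] > smoothed[t + 1]):
            events.append((t, smoothed[t]))
    return events

# One posteriorgram row: posterior of a single phone over time.
row = [0.0, 0.1, 0.6, 0.9, 0.7, 0.2, 0.0, 0.0, 0.3, 0.8, 0.4, 0.0]
filt = [0.25, 0.5, 0.25]  # simple triangular smoother as a stand-in
events = select_events(row, filt, threshold=0.5)
```

The twelve-frame trajectory collapses to two events, one per posterior peak, which is the kind of data reduction the abstract's mutual-information measure evaluates.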

Feature-Rich Language-Independent Syntax-Based Alignment for Statistical Machine Translation
Jason Riesa, Ann Irvine and Daniel Marcu
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP) – 2011

[abstract] [bib]

Abstract

We present an accurate word alignment algorithm that heavily exploits source and target-language syntax. Using a discriminative framework and an efficient bottom-up search algorithm, we train a model of hundreds of thousands of syntactic features. Our new model (1) helps us to very accurately model syntactic transformations between languages; (2) is language-independent; and (3) with automatic feature extraction, assists system developers in obtaining good word-alignment performance off-the-shelf when tackling new language pairs. We analyze the impact of our features, describe inference under the model, and demonstrate significant alignment and translation quality improvements over already-powerful baselines trained on very large corpora. We observe translation quality improvements corresponding to 1.0 and 1.3 BLEU for Arabic-English and Chinese-English, respectively.
@inproceedings{riesa-irvine-marcu:2011:EMNLP,
author = {Jason Riesa and Irvine, Ann and Daniel Marcu},
title = {Feature-Rich Language-Independent Syntax-Based Alignment for Statistical Machine Translation},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {497--507},
url = {http://dl.acm.org/citation.cfm?id=2145432.2145490}
}

Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
Juri Ganitkevitch, Chris Callison-Burch, Courtney Napoles and Benjamin Van Durme
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing – 2011

[abstract] [bib]

Abstract

Previous work has shown that high quality phrasal paraphrases can be extracted from bilingual parallel corpora. However, it is not clear whether bitexts are an appropriate resource for extracting more sophisticated sentential paraphrases, which are more obviously learnable from monolingual parallel corpora. We extend bilingual paraphrase extraction to syntactic paraphrases and demonstrate its ability to learn a variety of general paraphrastic transformations, including passivization, dative shift, and topicalization. We discuss how our model can be adapted to many text generation tasks by augmenting its feature set, development data, and parameter estimation routine. We illustrate this adaptation by using our paraphrase model for the task of sentence compression and achieve results competitive with state-of-the-art compression systems.
@inproceedings{ganitkevitch-EtAl:2011:EMNLP,
author = {Ganitkevitch, Juri and Callison-Burch, Chris and Napoles, Courtney and Van Durme, Benjamin},
title = {Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {1168--1179},
url = {http://www.aclweb.org/anthology/D11-1108}
}

Joshua 3.0: Syntax-based Machine Translation with the Thrax Grammar Extractor
Jonathan Weese, Juri Ganitkevitch, Chris Callison-Burch, Matt Post and Adam Lopez
Proceedings of the Sixth Workshop on Statistical Machine Translation – 2011

[abstract] [bib]

Abstract

We present progress on Joshua, an open source decoder for hierarchical and syntax-based machine translation. The main focus is describing Thrax, a flexible, open source synchronous context-free grammar extractor. Thrax extracts both hierarchical (Chiang, 2007) and syntax-augmented machine translation (Zollmann and Venugopal, 2006) grammars. It is built on Apache Hadoop for efficient distributed performance, and can easily be extended with support for new grammars, feature functions, and output formats.
@inproceedings{weese-EtAl:2011:WMT,
author = {Weese, Jonathan and Ganitkevitch, Juri and Callison-Burch, Chris and Post, Matt and Lopez, Adam},
title = {Joshua 3.0: Syntax-based Machine Translation with the Thrax Grammar Extractor},
booktitle = {Proceedings of the Sixth Workshop on Statistical Machine Translation},
address = {Edinburgh, Scotland},
publisher = {Association for Computational Linguistics},
pages = {478--484},
url = {http://www.aclweb.org/anthology/W11-2160}
}

Findings of the 2011 Workshop on Statistical Machine Translation
Chris Callison-Burch, Philipp Koehn, Christof Monz and Omar Zaidan
Proceedings of the Sixth Workshop on Statistical Machine Translation – 2011

[abstract] [bib]

Abstract

This paper presents the results of the WMT11 shared tasks, which included a translation task, a system combination task, and a task for machine translation evaluation metrics. We conducted a large-scale manual evaluation of 148 machine translation systems and 41 system combination entries. We used the ranking of these systems to measure how strongly automatic metrics correlate with human judgments of translation quality for 21 evaluation metrics. This year featured a Haitian Creole to English task translating SMS messages sent to an emergency response service in the aftermath of the Haitian earthquake. We also conducted a pilot ‘tunable metrics’ task to test whether optimizing a fixed system to different metrics would result in perceptibly different translation quality.
@inproceedings{callisonburch-EtAl:2011:WMT,
author = {Callison-Burch, Chris and Philipp Koehn and Christof Monz and Zaidan, Omar},
title = {Findings of the 2011 Workshop on Statistical Machine Translation},
booktitle = {Proceedings of the Sixth Workshop on Statistical Machine Translation},
address = {Edinburgh, Scotland},
publisher = {Association for Computational Linguistics},
pages = {22--64},
url = {http://www.aclweb.org/anthology/W11-2103}
}

You Are What You Tweet : Analyzing Twitter for Public Health
Michael Paul and Mark Dredze
5th International Conference on Weblogs and Social Media – 2011

[bib]

@inproceedings{citeulike:9834165,
author = {Paul, Michael and Dredze, Mark},
title = {You Are What You Tweet : Analyzing Twitter for Public Health},
booktitle = {5th International Conference on Weblogs and Social Media},
publisher = {AAAI Press},
pages = {265--272},
url = {http://www.cs.jhu.edu/~mpaul/files/2011.icwsm.twitter_health.pdf}
}

Learning Bilingual Lexicons using the Visual Similarity of Labeled Web Images
Shane Bergsma and Benjamin Van Durme
Proc. IJCAI – 2011

[bib]

@inproceedings{Bergsma-VanDurme:11,
author = {Bergsma, Shane and Van Durme, Benjamin},
title = {Learning Bilingual Lexicons using the Visual Similarity of Labeled Web Images},
booktitle = {Proc. IJCAI},
address = {Barcelona, Spain},
pages = {1764--1769}
}

Reranking Bilingually Extracted Paraphrases Using Monolingual Distributional Similarity
Charley Chan, Chris Callison-Burch and Benjamin Van Durme
Proceedings of the GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics – 2011

[abstract] [bib]

Abstract

This paper improves an existing bilingual paraphrase extraction technique using monolingual distributional similarity to rerank candidate paraphrases. Raw monolingual data provides a complementary and orthogonal source of information that lessens the commonly observed errors in bilingual pivot-based methods. Our experiments reveal that monolingual scoring of bilingually extracted paraphrases has a significantly stronger correlation with human judgment for grammaticality than do the probabilities assigned by the bilingual pivoting method. The results also show that monolingual distributional similarity can serve as a threshold for high precision paraphrase selection.
@inproceedings{chan-callisonburch-vandurme:2011:GEMS,
author = {Chan, Charley and Callison-Burch, Chris and Van Durme, Benjamin},
title = {Reranking Bilingually Extracted Paraphrases Using Monolingual Distributional Similarity},
booktitle = {Proceedings of the GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics},
address = {Edinburgh, UK},
publisher = {Association for Computational Linguistics},
pages = {33--42},
url = {http://www.aclweb.org/anthology/W11-2504}
}
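The bilingual pivoting baseline that this paper reranks scores a paraphrase candidate by marginalizing over shared foreign translations, p(e2|e1) = Σ_f p(e2|f) p(f|e1) (Bannard and Callison-Burch, 2005). A toy sketch of that computation, with hypothetical phrase-table probabilities:

```python
# Toy sketch of paraphrase scoring by bilingual pivoting. The phrase
# tables below are illustrative made-up data, not from the paper.

def pivot_paraphrase_scores(p_f_given_e, p_e_given_f, e1):
    """Score paraphrase candidates e2 for phrase e1 by marginalizing
    over shared foreign pivot phrases f:
        p(e2 | e1) = sum_f p(e2 | f) * p(f | e1)
    """
    scores = {}
    for f, p_fe in p_f_given_e.get(e1, {}).items():
        for e2, p_ef in p_e_given_f.get(f, {}).items():
            if e2 == e1:
                continue  # skip the trivial self-paraphrase
            scores[e2] = scores.get(e2, 0.0) + p_ef * p_fe
    return scores

# p(f | e): English phrase -> foreign translations (toy values)
p_f_given_e = {"under control": {"unter kontrolle": 0.8, "in den griff": 0.2}}
# p(e | f): foreign phrase -> English translations (toy values)
p_e_given_f = {
    "unter kontrolle": {"under control": 0.7, "in check": 0.3},
    "in den griff": {"under control": 0.5, "in hand": 0.5},
}

scores = pivot_paraphrase_scores(p_f_given_e, p_e_given_f, "under control")
```

Monolingual distributional similarity then reranks or thresholds these pivot-derived scores, which is the contribution the abstract describes.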

