OpenFst: a General and Efficient Weighted Finite-State Transducer Library – Michael Riley (Google)

October 16, 2007

We describe OpenFst, an open-source library for weighted finite-state transducers (WFSTs). OpenFst consists of a C++ template library with efficient WFST representations and over twenty-five operations for constructing, combining, optimizing, and searching them. At the shell-command level, there are corresponding transducer file representations and programs that operate on them. OpenFst is designed both to be very efficient in time and space and to scale to very large problems. The library has key applications in speech, image, and natural language processing, pattern and string matching, and machine learning. We give an overview of the library, including an outline of some key algorithms, examples of its use, details of its design that allow customizing the labels, states, and weights, and the lazy evaluation of many of its operations. Further information and a download of the OpenFst library can be obtained from the OpenFst web site. Joint work with Cyril Allauzen, Johan Schalkwyk, Wojtek Skut, and Mehryar Mohri.
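
To make the idea of customizable weights concrete, here is a minimal, self-contained C++ sketch of a transducer templated on its weight type. It does not use OpenFst and is not its API: the names TropicalWeight, RealWeight, Arc, ToyFst, and TotalWeight are all hypothetical and exist only for this illustration. The sketch also assumes an acyclic machine whose state ids are already in topological order, which a general library would not require.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical toy semiring: (min, +), often called the tropical semiring.
struct TropicalWeight {
  double value;
  static TropicalWeight Zero() {
    return {std::numeric_limits<double>::infinity()};
  }
  static TropicalWeight One() { return {0.0}; }
};
inline TropicalWeight Plus(TropicalWeight a, TropicalWeight b) {
  return {std::min(a.value, b.value)};
}
inline TropicalWeight Times(TropicalWeight a, TropicalWeight b) {
  return {a.value + b.value};
}

// Hypothetical toy semiring: (+, *), usable for probabilities.
struct RealWeight {
  double value;
  static RealWeight Zero() { return {0.0}; }
  static RealWeight One() { return {1.0}; }
};
inline RealWeight Plus(RealWeight a, RealWeight b) {
  return {a.value + b.value};
}
inline RealWeight Times(RealWeight a, RealWeight b) {
  return {a.value * b.value};
}

// A weighted arc: input label, output label, weight, destination state.
template <class W>
struct Arc {
  int ilabel;
  int olabel;
  W weight;
  int nextstate;
};

// A toy transducer; final_weight[s] is W::Zero() for non-final states.
template <class W>
struct ToyFst {
  int start = 0;
  std::vector<std::vector<Arc<W>>> arcs;  // arcs[s] = arcs leaving state s
  std::vector<W> final_weight;
};

// Combines the weights of all accepting paths using the semiring of W.
// Assumption: the machine is acyclic and every arc goes to a higher state id,
// so one pass over the states in increasing order is enough.
template <class W>
W TotalWeight(const ToyFst<W>& fst) {
  assert(fst.final_weight.size() == fst.arcs.size());
  const std::size_t n = fst.arcs.size();
  std::vector<W> dist(n, W::Zero());
  dist[fst.start] = W::One();
  W total = W::Zero();
  for (std::size_t s = 0; s < n; ++s) {
    for (const Arc<W>& arc : fst.arcs[s]) {
      dist[arc.nextstate] =
          Plus(dist[arc.nextstate], Times(dist[s], arc.weight));
    }
    total = Plus(total, Times(dist[s], fst.final_weight[s]));
  }
  return total;
}
```

The same TotalWeight code yields the cost of the best path when instantiated with TropicalWeight and the sum of all path probabilities when instantiated with RealWeight. Swapping the weight type without touching the algorithm is the kind of customization the abstract refers to, shown here only in miniature.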

Michael Riley received his B.S., M.S. and Ph.D. in computer science from MIT. He joined Bell Labs in Murray Hill, NJ in 1987 and moved to AT&T Labs in Florham Park, NJ in 1996. He is currently a member of the research staff at Google, Inc. in New York City. His interests include speech and natural language processing, text analysis, information retrieval, and machine learning.

Center for Language and Speech Processing