CLSP Logo

Finite State Revisited

Fernando Pereira
AT&T Research

Finite automata are usually taken for granted in language processing. The theory is seen as well understood and unchanging; the capabilities and limitations of tools such as lex, grep and regular expression packages are accepted without much reflection. Yet, most language-language processing applications do not involve a single analysis stage describable by a finite-state acceptor, but rather a succession of transformations involving increasingly abstract representations. Furthermore, they often involve alternative transformation rules with different costs that need to be combined and compared appropriately. Most current language processing systems handle composition of transformations and transformation costs with "ad hoc" specialized algorithms that are difficult to modify or generalize.

There is a better way. By moving from acceptors to transducers, we can represent mappings between sequences and their composition. By adding weights to automata transitions, we can represent the costs of alternative mappings, and combine they appropriately. The two extensions are compatible, and are backed up by the relatively little-known but very well developed theory of rational power series. Weighted finite-state transducers are not only a convenient representation for a variety of language processing tasks, but can also be optimized in ways that were inconceivable with the earlier "ad hoc" representations.

I will explain the main concepts and algorithms for weighted transducers using examples mainly from our speech recognition work, but also from text analysis and tagging. Much of the material is joint work with my colleagues Merhyar Mohri and Michael Riley, and has been implemented as a program library http://www.research.att.com/sw/tools/fsm.