This file describes the use of a generic alignment tool (based on the FSM toolkit provided by AT&T) that was developed at the JHU 2000 Summer Workshop. We assume the reader has access to Unix. The data files listed at the end of this page were generated with the command lines shown here.

The command line that does the alignment is
> trans2fsm_align.pl -q symb.syms cost.fst ref.trans hyp.trans > aligned_results
-q : Quiet mode
symb.syms - the symbols list (e.g., phones)
cost.fst - a transducer that gives the cost between two symbols
ref.trans - the reference transcription in NIST sclite trans format
hyp.trans - the hypothesis transcription in NIST sclite trans format
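
To illustrate what aligning a reference against a hypothesis under a symbol-to-symbol cost means, here is a minimal sketch in Python. It is not the tool's implementation: the tool relies on the FSM toolkit, whereas this sketch swaps in a plain dynamic-programming edit distance. The function name align, the table toy_cost and all cost values in it are hypothetical, and the use of "eps" for insertions and deletions is an assumption based on the symbol list described below.

```python
def align(ref, hyp, cost, eps="eps"):
    """Return (total_cost, pairs) for a cheapest alignment of ref to hyp.

    cost maps (ref_symbol, hyp_symbol) to a number. A pair (x, eps) is a
    deletion of x and a pair (eps, y) is an insertion of y.
    """
    n, m = len(ref), len(hyp)
    # best[i][j] is the cheapest cost of aligning ref[:i] with hyp[:j].
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i > 0 and j > 0:  # substitution or match
                step = cost[(ref[i - 1], hyp[j - 1])]
                options.append((best[i - 1][j - 1] + step, (i - 1, j - 1)))
            if i > 0:  # deletion of a reference symbol
                step = cost[(ref[i - 1], eps)]
                options.append((best[i - 1][j] + step, (i - 1, j)))
            if j > 0:  # insertion of a hypothesis symbol
                step = cost[(eps, hyp[j - 1])]
                options.append((best[i][j - 1] + step, (i, j - 1)))
            best[i][j], back[i][j] = min(options)
    # Follow the back-pointers from the end to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        r = ref[i - 1] if pi < i else eps
        h = hyp[j - 1] if pj < j else eps
        pairs.append((r, h))
        i, j = pi, pj
    pairs.reverse()
    return best[n][m], pairs


# Hypothetical toy cost table over three symbols (values are made up).
symbols = ["a", "b", "p"]
toy_cost = {}
for x in symbols:
    toy_cost[(x, "eps")] = 3.0  # deletion
    toy_cost[("eps", x)] = 3.0  # insertion
    for y in symbols:
        toy_cost[(x, y)] = 0.0 if x == y else 4.0
toy_cost[("b", "p")] = 1.0  # similar symbols are cheap to confuse

total, pairs = align(["b", "a"], ["p", "a", "a"], toy_cost)
print(total)
print(pairs)
```

With this toy table the cheapest alignment substitutes p for b, matches one a, and inserts the other a, for a total cost of 4.0.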
 

Description of the formats:
symb.syms looks like
eps     0
sil     1
sil_v   2
sil_u   3
b       4
b_v     5
b_u     6
p       7
p_v     8
[snip]
(NOTE: the symbol corresponding to "0" is the deletion symbol)
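
As a small illustration of this format, the following Python sketch reads a symbol list into a dictionary. The function name parse_symbols and the sample text are hypothetical; the sketch assumes each useful line holds exactly a symbol name and an integer, separated by whitespace.

```python
def parse_symbols(text):
    """Map each symbol name to its integer id."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank lines and placeholders such as "[snip]"
        name, index = fields
        table[name] = int(index)
    return table


sample = """\
eps       0
sil       1
sil_v     2
[snip]
b         4
"""
symbols = parse_symbols(sample)
print(symbols["eps"])
```

Looking up "eps" returns 0, the id reserved for the deletion symbol.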

cost.fst - a cost transducer that gives the cost between two symbols. This has to be generated with the AT&T toolkit. It can be generated by
> fsmcompile -i symb.syms -o symb.syms -t cost.stxt > cost.fst
Here cost.stxt looks like
[snip]
0   0   a   ao_u   5
0   0   a   ao_v   5
0   0   a   b      13
0   0   a   b_u    14
0   0   a   b_v    14
[snip]
The first two columns give the state numbers; the third gives the input symbol; the fourth the output symbol; the last column gives the cost of substituting the third with the fourth symbol.
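
To make the column layout concrete, here is a short Python sketch that reads such lines into a lookup table keyed by (input symbol, output symbol). The function name parse_cost_arcs is hypothetical, and the sketch assumes that every arc line has exactly five whitespace-separated fields; any other line is ignored.

```python
def parse_cost_arcs(text):
    """Return {(input_symbol, output_symbol): cost} for each five-field line."""
    costs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # not an arc line, e.g. "[snip]" or a blank line
        src_state, dst_state, in_sym, out_sym, cost = fields
        costs[(in_sym, out_sym)] = float(cost)
    return costs


sample = """\
[snip]
0   0 a   ao_u     5
0   0 a   b        13
0   0 a   b_u      14
[snip]
"""
costs = parse_cost_arcs(sample)
print(costs[("a", "b")])
```

Here the entry for ("a", "b") holds 13.0, the cost of substituting a with b.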

The method of arriving at these costs is task-dependent. For the workshop, the following method was used:
A classes file is created to categorize the symbols into several classes (the mapping from symbols to classes is not one-to-one: a symbol may belong to several classes). The idea is that symbols with similar properties should have lower substitution costs. The program makefsmfiles.pl goes through these classes one by one and considers every pair of symbols. It increments the cost between two symbols if one of them belongs to the class and the other does not. If both belong to the class, the cost is not incremented. (One drawback of this is that the costs become dependent on the number of categories used.)
An excerpt of the classes file, used here to illustrate how the cost between symbols is calculated, is shown below:
Final          a ai an ang ao e ei en eng er o ong ou i i1 i2 ia
Initial        b p m f d t n l g k h j q x z c s zh ch sh r
I_UnAspStop    b d g
I_AspStop      p t k

Final is a property shared by the phones listed against it, and so on for the other classes. The costs listed below are only meant to indicate the idea used.
Here the cost c(a,b) = 3, c(b,d) = 0, c(b,p) = 1, etc.
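
The counting idea can be sketched in Python as follows. This is one reading of the description above, not the code of makefsmfiles.pl: the cost goes up by one for every class that contains exactly one of the two symbols. The names CLASSES and class_cost are hypothetical, and only the four classes of the excerpt are used. Under this reading the sketch reproduces c(a,b) = 3 and c(b,d) = 0, but it yields 2 for the pair (b, p) rather than the 1 quoted on this page, so the actual script (or the fixed costs from rules.txt) may count differently.

```python
# The four classes from the excerpt above.
CLASSES = {
    "Final": "a ai an ang ao e ei en eng er o ong ou i i1 i2 ia".split(),
    "Initial": "b p m f d t n l g k h j q x z c s zh ch sh r".split(),
    "I_UnAspStop": "b d g".split(),
    "I_AspStop": "p t k".split(),
}


def class_cost(x, y, classes=CLASSES):
    """Count the classes that contain exactly one of the symbols x and y."""
    cost = 0
    for members in classes.values():
        if (x in members) != (y in members):
            cost += 1
    return cost


print(class_cost("a", "b"))
print(class_cost("b", "d"))
print(class_cost("b", "p"))
```

Because every class contributes at most one unit, adding or removing classes changes all the costs, which is the drawback mentioned above.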

Another file, called rules.txt, is used to fix certain costs and to allow for contractions and expansions. All the above files can be generated by the following command line
> makefsmfiles.pl newsyllclasses rules.txt cost.stxt symb.syms
 

The alignment tool creates several temporary files in the /tmp directory; these are deleted after each utterance is processed. The directory can be changed by editing the program.
The various tools and example data files that were mentioned in this page can be downloaded here:

A single .tar.gz file: jhu.clsp.ws2000.tar.gz (Around 200KB)
As individual uncompressed files:
trans2fsm_align.pl
symb.syms
cost.fst
cost.stxt
ref.trans
hyp.trans
aligned_results
rules.txt
newsyllclasses
makefsmfiles.pl

The AT&T fsmtools are available for download under conditions at http://www.research.att.com/sw/tools/fsm/