520.666 Home page

Information Extraction from Speech and Text
Course webpage

Projects

Project 1

Project 2

Project 3
prob3treesimple2(pg 8)
prob3tree(pg 9)
prob3treeentropy2(pg 10)
prob3treegini2(pg 11)
In each of the above figures, the first row corresponds to the third-previous letter of the history, the second row to the second-previous letter, and the third row to the immediately preceding letter. The final nodes are the nodes at which a further split gives almost no Gini/entropy reduction. The detected letters are simply the letters the decision tree would predict if a Shannon game were played on a data set (that is, the maximum-likelihood letter at each final node).

Assignments/Homeworks

Homework 1
Homework 2
Homework 3
Homework 4
Homework 5
Homework 6
Homework 7
Homework 8

Homework 9
Code for held-out estimation
Code for Good-Turing
File containing the unigram estimate (729 symbols) from Good-Turing (required for katz2.c)
Code implementing Katz backoff for bigram and trigram models
Training set
Test set