WS 99 TDT Resources

Where to find things we have installed locally:

Data

All corpora are located under /export/tdt/ws99/data:
NID data we have so far:
tdt2-novtag-train/

.nov files are SGML data from LDC; .vecs files are the input format for the NID software.
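
To check which files of each kind are present, a minimal Python sketch follows. It assumes the files sit somewhere under tdt2-novtag-train/ and carry the .nov and .vecs suffixes mentioned above; the function name group_by_suffix is hypothetical and not part of any installed tool.

```python
from collections import defaultdict
from pathlib import Path

# Assumed location: the data root joined with the directory name listed above.
NID_DATA_DIR = Path("/export/tdt/ws99/data/tdt2-novtag-train")


def group_by_suffix(root, suffixes=(".nov", ".vecs")):
    """Return {suffix: sorted list of paths} for matching files under root."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            groups[path.suffix].append(path)
    return {suffix: sorted(paths) for suffix, paths in groups.items()}


if __name__ == "__main__":
    # Print a count per file type found under the assumed data directory.
    for suffix, paths in group_by_suffix(NID_DATA_DIR).items():
        print(f"{suffix}: {len(paths)} files")
```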

The entire TDT2 corpus:
/export/tdt/ws99/data/tdt2/

Judgments of topic relevance for 100 topics from TDT2:
/export/tdt/ws99/data/tdt2.rel1

Named Entity tagging for TDT2 from BBN:
/export/tdt/ws99/data/tdt2-NEtag

Other corpora that might be useful as additional data:
LA-Times98, SDR99-Newswire

Tools

Software is located under /export/tdt/ws99/tools/:
Charniak statistical parser:
CharniakParser_v4

Preliminary FSD program:
tdt/nid
The binary to run: /export/tdt/ws99/tools/tdt/nid/BIN-linux/nid. It takes either -task fsd or -task nid.
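
As a rough illustration of calling the binary from a script, the Python sketch below builds the command line for either task. Only the binary path and the -task flag come from the notes above; the helper names are hypothetical, and any further arguments the program needs are not documented here, so they are passed through unchanged via extra_args.

```python
import subprocess

# Path and task names as listed above.
NID_BINARY = "/export/tdt/ws99/tools/tdt/nid/BIN-linux/nid"
VALID_TASKS = ("fsd", "nid")


def build_nid_command(task, extra_args=()):
    """Return the argument list for running nid with the given task."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    # extra_args is an assumption: a placeholder for whatever else nid requires.
    return [NID_BINARY, "-task", task, *extra_args]


def run_nid(task, extra_args=()):
    """Run the nid binary and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_nid_command(task, extra_args), check=True)
```

Keeping command construction separate from execution makes it easy to print the command before running it on the workshop machines.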

A wrapper script that runs the NIST eval program on the output of the nid program:
/export/tdt/ws99/misc/tdt-lab/fsdeval

NIST FSD evaluation software:
TDT3eval_v1.2

UMass FSD evaluation software:
umassEval

Miscellany:

Useful Links