Collective Supervision of Topic Models for Predicting Surveys with Social Media

The tokenized tweet data used in “Collective Supervision of Topic Models for Predicting Surveys with Social Media” (AAAI ’16) are listed below. Please respect the Twitter terms of service, and download no more than one of these files each day (50K tweets). For code to train the topic models, see

https://bitbucket.org/adrianbenton/sprite/

Tweet data:

input.guncontrol.1.txt
input.guncontrol.2.txt
input.vaccine.1.txt
input.vaccine.2.txt
input.tobacco.1.txt
input.tobacco.2.txt

If you just want the tweet IDs and a description of the data, see

https://github.com/abenton/collsuptmdata

If you end up using these data, please cite:

Adrian Benton, Michael J. Paul, Braden Hancock, Mark Dredze.
Collective Supervision of Topic Models for Predicting Surveys with Social Media.
Thirtieth AAAI Conference on Artificial Intelligence, 2016.

