Collective Supervision of Topic Models for Predicting Surveys with Social Media

See below for the tokenized tweet data used in “Collective Supervision of Topic Models for Predicting Surveys with Social Media” (AAAI ’16). Please respect the Twitter terms of service and download no more than one of these files each day (50K tweets). For code to train the topic models, see

https://bitbucket.org/adrianbenton/sprite/

 

Tweet data:

 

If you just want the tweet IDs and a description of the data, see

https://github.com/abenton/collsuptmdata

 

If you end up using these data, please cite:

Adrian Benton, Michael J. Paul, Braden Hancock, Mark Dredze.
Collective Supervision of Topic Models for Predicting Surveys with Social Media.
Thirtieth AAAI Conference on Artificial Intelligence, 2016.

 

Center for Language and Speech Processing