TweetsKB - A Public and Large-Scale RDF Corpus of Annotated Tweets

TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for more than 1.5 billion tweets, spanning almost 5 years (January 2013 - November 2017). Tweet metadata, together with extracted entities, sentiments, hashtags and user mentions, is exposed in RDF using established RDF/S vocabularies. To protect privacy, we anonymize the usernames and do not provide the text of the tweets. However, the actual tweet content and further information can be fetched using the tweet IDs.
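Since the tweet text is not part of the corpus, a typical first step is to collect tweet IDs from the RDF so that the content can be looked up separately. The following is a minimal sketch of that step, not the project's own tooling. It assumes the data is available as N-Triples and that the ID is stored as a literal under the SIOC "id" property; the sample triples, the example.org URIs and the helper name tweet_ids are hypothetical. The sample files on the home page show the vocabulary actually used.

```python
import re

# Hypothetical sample in N-Triples syntax. The URIs and the use of the
# SIOC "id" property are assumptions for illustration only.
SAMPLE = """\
<http://example.org/tweet/1001> <http://rdfs.org/sioc/ns#id> "1001" .
<http://example.org/tweet/1001> <http://purl.org/dc/terms/created> "2013-01-05T10:00:00Z" .
<http://example.org/tweet/1002> <http://rdfs.org/sioc/ns#id> "1002" .
"""

ID_PROPERTY = "http://rdfs.org/sioc/ns#id"  # assumed property holding the tweet ID

# Matches a triple whose object is a literal: <subject> <predicate> "value" .
# An optional datatype or language tag after the closing quote is tolerated.
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+"([^"]*)"[^\s]*\s*\.\s*$')


def tweet_ids(ntriples_text, id_property=ID_PROPERTY):
    """Return the literal values of all triples that use id_property."""
    ids = []
    for line in ntriples_text.splitlines():
        match = TRIPLE.match(line.strip())
        if match and match.group(2) == id_property:
            ids.append(match.group(3))
    return ids


if __name__ == "__main__":
    print(tweet_ids(SAMPLE))
```

The regular expression only handles simple literal objects and is meant for illustration; for the full corpus, a proper RDF parser or a SPARQL endpoint would be the more robust choice.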

Links to all parts:

Sample files, example queries and more information are available through TweetsKB's home page: http://l3s.de/tweetsKB/.

Data and Resources

Additional Info

Field          Value
Source         http://l3s.de/tweetsKB/
Author         Pavlos Fafalios
Maintainer     Pavlos Fafalios
Version        1.0
Last Updated   December 13, 2017, 14:41 (UTC)
Created        December 13, 2017, 14:14 (UTC)