Billion Triples Challenge Dataset 2010

Dataset used for the Billion Triples Challenge 2010.


The major part of the dataset was crawled from the Web of Linked Data during March/April 2010 using the MultiCrawler/SWSE framework, based on datasets provided by Falcon-S, Sindice, Swoogle, SWSE, and Watson. We also included partial data from and

The downloaded content was parsed using the Redland toolkit with the rdfxml parser. We rewrote blank node identifiers to include the data source, so that blank nodes are unique per data source, and appended the data source to the output file. The data is encoded in N-Quads format and split into chunks of 10 million statements each.
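
The processing described above can be sketched roughly as follows. This is a minimal illustration and not the pipeline that produced the dataset: the function names (`scoped_bnode`, `to_nquad`, `chunks`) are hypothetical, the hash-based renaming scheme is an assumption, and so is the reading that the data source is written as the fourth (context) term of each N-Quads statement.

```python
import hashlib
from itertools import islice


def scoped_bnode(label, source):
    """Return a blank node label that is unique per data source.

    Assumption: a short hash of the source URI is folded into the label.
    The scheme actually used for the dataset is not documented here.
    """
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()[:12]
    return f"_:b{digest}x{label}"


def to_nquad(subject, predicate, obj, source):
    """Serialise one statement as an N-Quads line with the source as context."""
    terms = []
    for term in (subject, predicate, obj):
        # Blank nodes start with "_:"; IRIs start with "<", literals with '"'.
        if term.startswith("_:"):
            term = scoped_bnode(term[2:], source)
        terms.append(term)
    return f"{terms[0]} {terms[1]} {terms[2]} <{source}> ."


def chunks(lines, size):
    """Yield successive lists of at most `size` items from `lines`."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

With this sketch, the same blank node label `_:n1` found in two different source documents yields two distinct identifiers, so statements from separate sources are not merged by accident. For the dataset itself, `size` would be 10 million statements per chunk.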

The datasets of the Billion Triples Challenges 2008 and 2009 are also still available.

Data and Resources

Additional Info

Field Value
Author Andreas Harth
Maintainer Andreas Harth
Version 2010
Last Updated October 10, 2013, 19:58 (UTC)
Created September 9, 2010, 09:57 (UTC)
triples 3200000000