-
USPTO Patent data
Linked Data version of the US Patent and Trademark Office (USPTO) data. Number of triples: 212,234,735. Number of resources: 3,215,768. Links to other datasets: DBpedia,... -
DBpedia abstract corpus
This corpus contains a conversion of Wikipedia abstracts in six languages (Dutch, English, French, German, Italian and Spanish) into the NLP Interchange Format (NIF).... -
LODStats
LODStats: The Data Web Census Dataset. -
SemanticQuran
The Semantic Quran dataset is a multilingual RDF representation of translations of the Quran. The dataset was created by integrating data from two different semi-structured... -
GWPP Glossary
The GWPP glossary is a set of scientific terms and their definitions that are used inside the Global Water Pathogen Project online book. This dataset is crowdsourced by a large... -
Lidioms
The LIDIOM dataset is a multilingual RDF representation of idioms containing five languages. The data set was crawled and integrated from various sources. For assuring the... -
LinkLion - A Link Repository for the Web of Data
LinkLion is an open-source central repository for the storage of links among resources in the Linked Open Data web. The main goal of LinkLion is to facilitate the publication,... -
Linked TCGA
Linked TCGA is the RDF version of the Cancer Genome Atlas, a pilot project started in 2005 by the National Cancer Institute (NCI) and the National Human Genome Research... -
JRC-Names-MLODE
From their web site: JRC-Names is a highly multilingual named entity resource for person and organisation names (called 'entities'). It consists of large lists of names and... -
Caucasian Spiders
The Caucasian Spiders Database aims at containing all records (published occurrences) of spiders (Araneae) in the Caucasus Ecoregion (the rayons Krasnodar and Stavropol in... -
CORDIS corpus
CORDIS (Community Research and Development Information Service) is the European Commission’s core public repository providing dissemination information for all EU-funded... -
CORDIS
todo -
aksw.org Research Group dataset
This dataset contains projects, subgroups, people and pages of the Agile Knowledge Management and Semantic Web (AKSW) Research Group at the University of Leipzig. -
KORE 50 NIF NER Corpus
KORE 50[1] (AIDA) is a subset of the larger AIDA corpus, which is based on the dataset of the CoNLL 2003 NER task. The dataset aims to capture hard-to-disambiguate mentions of... -
ORCID
ORCID (Open Researcher and Contributor ID) is a nonproprietary alphanumeric code to uniquely identify scientific and other academic authors. This dataset contains RDF conversion... -
Statbel Corpus
This corpus contains an RDF conversion of datasets from "Statistics Belgium" (also known as Statbel), which aims at collecting, processing and disseminating relevant, reliable... -
Global airports in RDF
This corpus contains an RDF conversion of the Global airports dataset, which was retrieved from openflights.org. The dataset contains information about airport names, their locations,... -
Lion's Den
Lion's Den is an RDF repository of link specifications. Lion's Den is intended to be an open community-driven dataset that allows data publishers to also publish their... -
LSQ
Linked SQ: a Linked Dataset describing SPARQL queries extracted from the logs of a variety of prominent public SPARQL endpoints. We argue that this dataset has a variety of uses... -
Brown Corpus in RDF/NIF
RDF version of the Brown Corpus (W. N. Francis, H. Kucera; Brown University; 1979). 1,014,312 words in 500 documents, taken from newspaper texts on diverse topics, non-fiction...