Search for a Dataset - the Datahub

USAGE review corpus

This corpus consists of sentiment annotations of Amazon reviews for different product categories in the languages German and English. The reviews themselves are not part of this...
- example
- text/ntriples
- RDF
- api/sparql
Linguistic Metadata (LIME) vocabulary

LIME (LInguistic MEtadata) is a vocabulary for expressing linguistic metadata about linguistic resources and linguistically grounded datasets. The metadata vocabulary has been...
- HTML
- RDF
IWN

This is the dataset corresponding to the ItalWordNet as created at the Institute of Computational Linguistics "A. Zampolli" in Pisa. The resource contains single instances such...
- RDF
- tar.gz
SIMPLE

This dataset contains the conversion of the Italian SIMPLE lexicon in different formats including RDF, TTL and a Lemon version of lexical entries with their pointers to senses.
- RDF
- JSON
- TXT
- text/turtle
gemet-annotated

Details about how this dataset was built are described in the article: Are SKOS concept schemes ready for multilingual retrieval applications? — Diana Tanase and Epaminondas...
- RDF
Leipzig Corpora Collection (LCC)

Deutscher Wortschatz contains data generated from newspapers and web resources that are publicly available. The data were collected per language and encompass statistics about...
Automated Similarity Judgment Program lexical data

ASJP collects 40 words from 5500 languages in a simplified phonetic representation. More background can be found at http://email.eva.mpg.de/~wichmann/ASJPHomePage.htm
Atlante Sintattico d'Italia (ASIt)

The Atlante Sintattico d'Italia, Syntactic Atlas of Italy (ASIt) enterprise builds on a long-standing tradition of collecting and analysing linguistic corpora, which has...
- RDF
- XML

You can also access this registry using the API (see API Docs).

8 datasets found