- gemet-annotated
How this dataset was built is described in the article "Are SKOS concept schemes ready for multilingual retrieval applications?" by Diana Tanase and Epaminondas...
- Leipzig Corpora Collection (LCC)
Deutscher Wortschatz contains data generated from newspapers and web resources that are publicly available. The data were collected per language and encompass statistics about...
- Automated Similarity Judgment Program lexical data
ASJP collects 40 words from 5500 languages in a simplified phonetic representation. More background can be found at http://email.eva.mpg.de/~wichmann/ASJPHomePage.htm
- Atlante Sintattico d'Italia (ASIt)
The Atlante Sintattico d'Italia, Syntactic Atlas of Italy (ASIt) enterprise builds on a long-standing tradition of collecting and analysing linguistic corpora, which has...