-
Open Multilingual Wordnet
Documentation of and links to data for wordnets in 20 languages (Albanian, Arabic, Danish, English, Persian, Finnish, French, Hebrew, Italian, Japanese, Basque, Catalan,... -
KAIST silver standard corpus
Availability: Freely available. Usage: Named Entity Recognition. Status: Newly created-finished. Description: We propose a novel method to... -
PanLex
A lexical database documenting translations among lexemes of language varieties. -
xLiD-Lexica
Our xLiD-Lexica dataset in RDF (http://km.aifb.kit.edu/resources/xLiD-lexica.nt) contains about 300 million triples of cross-lingual groundings. It is extracted from Wikipedia... -
Syntactic Reference Corpus of Medieval French (SRCMF)
The SRCMF contains 15 Old French texts totaling about 280,000 words. It has high-quality manual annotation based on a linguistically adequate dependency grammar. Annotation... -
OLiA Discourse
OLiA Discourse Extensions -
Linked Hypernyms
The Linked Hypernyms dataset assigns entity articles in the English, German and Dutch Wikipedias a DBpedia resource or a DBpedia ontology concept as their type. The types are... -
ISOcat-metadata
The linguistics community is building a metadata-based infrastructure for the description of its research data and tools. At its core is the ISOcat registry ISOcat.org, a... -
Phonetics Information Base and Lexicon (PHOIBLE)
Phonetics Information Base and Lexicon (PHOIBLE) is a data set of phonological inventories with additional linguistic and non-linguistic information. -
Linked Old Germanic Dictionaries
Lexical resources (word lists, etymological dictionaries) for Germanic languages in different historical stages: pre 1100 (incl. Gothic, Old High German, Old English),...