The LIDIOM dataset is a multilingual RDF representation of idioms covering five languages. The dataset was crawled and integrated from various sources. To assure the quality of the crawled data, all idioms were evaluated by at least two native speakers. We designed the dataset to be easily usable in natural-language processing applications, with the goal of facilitating content translation tasks. We present the ontology devised for structuring the data and provide the transformation rules implemented in our extraction framework. In particular, the dataset follows the best practices of the Linguistic Linked Open Data (LLOD) community. We also detail the link creation process as well as possible usage scenarios for the linked idioms dataset.
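
As a rough illustration of how linked idiom data of this kind might be used for translation lookup, the sketch below models a few idiom entries as subject-predicate-object triples in plain Python. It is not the LIDIOM ontology: the namespace, the property names (`ex:language`, `ex:writtenRep`, `ex:translation`), the sample entries, and the language pair are all hypothetical placeholders chosen only for the example.

```python
# Hypothetical illustration only: the namespace, property names, and sample
# entries below are placeholders, not the actual LIDIOM vocabulary or data.
from collections import defaultdict

EX = "http://example.org/idiom/"  # placeholder namespace (assumption)

# Each triple is (subject, predicate, object), mirroring the RDF data model.
TRIPLES = [
    (EX + "en_1", "ex:language", "en"),
    (EX + "en_1", "ex:writtenRep", "break the ice"),
    (EX + "de_1", "ex:language", "de"),
    (EX + "de_1", "ex:writtenRep", "das Eis brechen"),
    (EX + "en_1", "ex:translation", EX + "de_1"),  # cross-lingual link
]


def build_index(triples):
    """Group objects by subject and predicate for quick lookup."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


def translations(index, text, source_lang, target_lang):
    """Return written forms in target_lang linked from a source_lang idiom."""
    results = []
    for props in index.values():
        is_match = (
            text in props.get("ex:writtenRep", [])
            and source_lang in props.get("ex:language", [])
        )
        if not is_match:
            continue
        # Follow each translation link and keep targets in the wanted language.
        for target in props.get("ex:translation", []):
            target_props = index.get(target, {})
            if target_lang in target_props.get("ex:language", []):
                results.extend(target_props.get("ex:writtenRep", []))
    return results


INDEX = build_index(TRIPLES)
```

With this toy data, `translations(INDEX, "break the ice", "en", "de")` follows the single translation link and returns the German written form. In practice the same lookup would more likely be expressed as a SPARQL query against the published RDF, using whatever vocabulary the dataset actually defines.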

Data and Resources

Additional Info

Field          Value
Author         Diego Moussallem
Maintainer     Diego Moussallem
Version        1.0
Last Updated   December 28, 2016, 19:23 (UTC)
Created        February 12, 2016, 21:51 (UTC)