Linked Open Data Integration Benchmark (LODIB) Specification - V1.0

Carlos R. Rivero (University of Sevilla, Spain)
Andreas Schultz (Freie Universität Berlin, Germany)
Chris Bizer (Freie Universität Berlin, Germany)
This version:
Latest version:
Publication Date: 02/20/2012


Linked Data sources, which rely on semantic-web technologies to publish, connect and query their data, have been growing steadily since 2007. The vast majority of Linked Data sources define or extend their own vocabulary to publish their data, so the vocabularies used in this context are heterogeneous. To address this heterogeneity, mappings are used to perform data translation, i.e., exchanging data from the source data model to the target data model. In the literature, these mappings are called executable mappings, since they consist of queries that are executed over a source and translate its data into a target, and there exist a number of approaches to define and automatically generate them. A benchmark to test data translation systems is therefore increasingly important. As far as we know, the literature offers two benchmarks to test such systems: one focuses on nested relational models, which makes it difficult to extrapolate to the context of Linked Data, and the other focuses on completely synthetic scenarios without taking real-world Linked Data sources into account. In this document, we present LODIB (Linked Open Data Integration Benchmark), a benchmark to test systems that perform data translation in the context of Linked Data sources. It provides a catalogue of fifteen data translation patterns based on real-world problems in the Linked Data context. LODIB is able to measure the recall, time performance, and expressivity of data translation systems. Finally, it provides a synthetic data generator that allows the source data to be scaled.
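
To make the notion of an executable mapping more concrete, the following minimal Python sketch treats a mapping as a query over source triples that emits target triples, and computes recall against an expected result. It is an illustration only, not part of the benchmark: the example.org vocabularies, the function names rename_property and recall, and the recall definition (the fraction of expected target triples that were produced) are all assumptions made here.

```python
# Hypothetical source and target vocabularies, used only for illustration.
SRC = "http://example.org/source#"
TGT = "http://example.org/target#"


def rename_property(triples, source_prop, target_prop):
    """Toy executable mapping: select triples that use source_prop and
    emit them with target_prop instead. Other triples are not translated."""
    return {(s, target_prop, o) for (s, p, o) in triples if p == source_prop}


def recall(produced, expected):
    """Assumed recall definition: share of expected triples that were produced."""
    if not expected:
        return 1.0
    return len(produced & expected) / len(expected)


# Source data expressed in the source vocabulary.
source = {
    ("ex:alice", SRC + "fullName", "Alice"),
    ("ex:bob", SRC + "fullName", "Bob"),
    ("ex:bob", SRC + "age", "42"),
}

# Target data that a correct translation should contain.
expected = {
    ("ex:alice", TGT + "name", "Alice"),
    ("ex:bob", TGT + "name", "Bob"),
}

produced = rename_property(source, SRC + "fullName", TGT + "name")
score = recall(produced, expected)
```

In this sketch a system that translates both names reaches full recall, while a system that produces no target triples scores zero. Real data translation systems would express the mapping as a query over RDF data rather than as a Python set comprehension.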

Table of Contents