
The R2R Framework

Translating RDF data from the Web to a target vocabulary
Andreas Schultz
Chris Bizer
Robert Isele

 

Data is represented on the Web of Linked Data using terms from a wide range of different vocabularies. The R2R Framework enables Linked Data applications that discover Web data represented using unknown terms to search the Web for mappings and apply the discovered mappings to translate that data to the application's target vocabulary. The R2R Framework is intended for Linked Data publishers, vocabulary maintainers and Linked Data application developers. It supports them by:

  1. providing the R2R Mapping Language for publishing fine-grained term mappings on the Web
  2. defining best practices for how mappings can be discovered by Linked Data applications
  3. providing an open-source implementation of the R2R Mapping Engine.

This document gives a short overview of the R2R Framework, describes its installation and configuration and gives several mapping examples.

Download the R2R Java API

v0.2.2 (alpha), released 2011-05-09

News

  • 2013-02-15: First version of the R2R Graphical User Interface released. The R2R Graphical User Interface is a web application that allows the user to execute R2R vocabulary mappings on data sources that are either located in a triple store or in RDF dumps.
  • 2011-06-29: R2R is used in the Linked Data Integration Framework. The Linked Data Integration Framework (LDIF) is a data integration tool for translating heterogeneous Linked Data from the Web into a clean, local target representation while keeping track of data provenance.
  • 2011-05-09: Version 0.2.2 released. Added XPath functions.
  • 2010-07-30: Version 0.2 released. Composition method for chaining partial mappings from different sources based on a mapping quality assessment heuristic added.
  • 2010-04-29: Version 0.1 released. Initial version of the R2R Framework released. 

Contents

  1. About the R2R Framework
  2. Quick start
  3. Mapping Examples
  4. Configuration options
  5. R2R Graphical User Interface
  6. Source code and development
  7. References

1. About the R2R Framework

The promise of the Web of Linked Data is to enable client applications to discover new data sources by following RDF links at run-time and to smoothly integrate data from these sources. Linked Data sources use different vocabularies to describe the same type of objects. It is also common practice to mix terms from different widely used vocabularies with proprietary terms. In contrast, Linked Data applications usually expect data to be represented using a consistent target vocabulary. Thus, Linked Data applications need to translate Web data to their local schema before doing any sophisticated data processing. The R2R Framework supports them in this task. The framework consists of a mapping language for expressing term correspondences, best practices on how to publish mappings on the Web and a Java API for transforming data according to these mappings. The syntax of the R2R mapping language is very similar to the query language SPARQL, which makes it easier to learn. The mapping language covers value transformations for use cases where RDF datasets use different units of measurement and can handle one-to-many and many-to-one correspondences between vocabulary elements. The R2R Java API transforms Web data to a given target vocabulary. The R2R Framework can be employed in two use cases:

  1. Closed Use Case: R2R can be used to translate Web data to a target vocabulary based on a fixed set of R2R Mappings. Within this use case, input data can be provided as an RDF file or a Jena Model, or it can be retrieved from a SPARQL endpoint. Based on the given set of mappings and a given specification of the target vocabulary, the R2R API selects and combines the relevant mappings and transforms the input data into the target vocabulary.
  2. Global, open Use Case: R2R can also be used in an open, distributed fashion. In this use case, data publishers as well as vocabulary maintainers are assumed to publish R2R Mappings on the Web as Linked Data. Linked Data applications that discover Web data represented using unknown terms can search the Web for mappings (Mapping Discovery) and use the R2R API to combine and chain the discovered mappings in order to translate unknown terms to the application-specific target vocabulary.

For details about the R2R Mapping Language, about publishing and discovering R2R Mappings on the Web, and about using the R2R Java API, see:

  • R2R User Manual: Language Specification, Publishing Vocabulary and Mapping Discovery

2. Quick start

To use the R2R Framework, you need:

  • Java 1.5 or higher to run R2R (check with java -version if you're not sure).

What to do:

  1. Download and extract the R2R archive into a suitable location.

  2. Test your installation with the examples from the Mapping Examples section.

3. Mapping Examples

To be able to execute the following examples you need to set up everything as explained under Quick start.

Example 1: vCard, FOAF, DBpedia mappings

This example shows how an R2R mapping is used to transform RDF data contained in a local file. The target vocabulary definition includes only properties, without a class restriction. The mp:VCardBirthDayMapping is a simple mapping that only renames the property. The mp:VCardEmailToFoafMbox property mapping is also a simple rename mapping, but the source property can have one of several URIs, in this case v:email or v:workEmail. The last mapping, mp:concatFirstAndLastNameMapping, is the first that uses a value transformation. It takes the separate first and last name of the source and concatenates them into one string: 'last name, first name'.

  • R2R mapping file: mappings.ttl
  • Source RDF file: example1_data.ttl
  • Target vocabulary definition:
  • @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix dbpedia: <http://dbpedia.org/ontology/> .
    @prefix v: <http://www.w3.org/2006/vcard/ns#> .
    (
    foaf:mbox,
    dbpedia:birthDay,
    v:n
    )
  • Mappings from the mapping file that actually get executed:
    • mp:VCardEmailToFoafMbox
    • mp:VCardBirthDayMapping
    • mp:concatFirstAndLastNameMapping
  • Example1 application source code: Example1.java
  • Execution instructions:
    • Change into the example_data directory in the R2R root dir.
    • Under Unix type:
    • ./run de.fuberlin.wiwiss.r2r.examples.Example1
    • Under Windows type:
    • run de.fuberlin.wiwiss.r2r.examples.Example1
  • Output created by running the example application:
    • N-Triple: example1_output.nt
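
The effect of the three mappings in this example can be pictured in plain Java. The following is a minimal sketch under stated assumptions, not R2R code: it does not call the R2R API, the class Example1Sketch and its methods are hypothetical, and the source keys v:bday, v:given-name and v:family-name are assumed for illustration (only v:email and v:workEmail are named above). The sample person in main is made up.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this class does not call the R2R API.
class Example1Sketch {

    // Source properties accepted by the e-mail rename mapping (named in the text above).
    static final List<String> EMAIL_SOURCES = Arrays.asList("v:email", "v:workEmail");

    // Value transformation: join the separate name parts as "last name, first name".
    static String concatName(String firstName, String lastName) {
        return lastName + ", " + firstName;
    }

    // Translates the property/value pairs of one resource into the target vocabulary.
    // The source keys v:bday, v:given-name and v:family-name are assumptions.
    static Map<String, String> translate(Map<String, String> source) {
        Map<String, String> target = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : source.entrySet()) {
            if (EMAIL_SOURCES.contains(entry.getKey())) {
                // Rename mapping with several possible source URIs.
                target.put("foaf:mbox", entry.getValue());
            } else if ("v:bday".equals(entry.getKey())) {
                // Simple rename mapping.
                target.put("dbpedia:birthDay", entry.getValue());
            }
        }
        String first = source.get("v:given-name");
        String last = source.get("v:family-name");
        if (first != null && last != null) {
            // Mapping with a value transformation.
            target.put("v:n", concatName(first, last));
        }
        return target;
    }

    public static void main(String[] args) {
        Map<String, String> person = new LinkedHashMap<>();
        person.put("v:workEmail", "mailto:jane@example.org");
        person.put("v:bday", "1980-01-01");
        person.put("v:given-name", "Jane");
        person.put("v:family-name", "Doe");
        System.out.println(translate(person));
    }
}
```

In R2R itself these rules are declared in mappings.ttl and selected by the engine from the target vocabulary definition; the sketch only mirrors their effect on a single resource.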

Example 2: Transformation between different units of measurement

This example illustrates how R2R is used to transform property values between different units of measurement. This time we transform a melting point value given in Fahrenheit to a value in Kelvin. The target property also has a different URI than the source property.

  • R2R mapping file: mappings.ttl
  • Source RDF file: example2_data.ttl
  • Target vocabulary definition:
  • @prefix dbpedia: <http://dbpedia.org/ontology/> .
    (
    dbpedia:meltingPoint
    )
  • Mappings from the mapping file that actually get executed:
    • mp:numericTransformationMapping
  • Example2 application source code: Example2.java
  • Execution instructions:
    • Change into the example_data directory in the R2R root dir.
    • Under Unix type:
    • ./run de.fuberlin.wiwiss.r2r.examples.Example2
    • Under Windows type:
    • run de.fuberlin.wiwiss.r2r.examples.Example2
  • Output created by running the example application:
    • N-Triple: example2_output.nt
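
The arithmetic behind this unit conversion is K = (F - 32) * 5/9 + 273.15. The following plain Java sketch shows only that calculation; the class and method names are hypothetical, and in R2R the formula is expressed in the mapping itself rather than in application code.

```java
// Hypothetical illustration of the Fahrenheit-to-Kelvin arithmetic; not part of the R2R API.
class FahrenheitToKelvin {

    // K = (F - 32) * 5/9 + 273.15
    static double toKelvin(double fahrenheit) {
        return (fahrenheit - 32.0) * 5.0 / 9.0 + 273.15;
    }

    public static void main(String[] args) {
        // The freezing point of water, 32 degrees Fahrenheit.
        System.out.println(toKelvin(32.0)); // prints 273.15
    }
}
```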

Example 3: Transform Data provided by DBpedia SPARQL endpoint

This example shows how an R2R mapping is used to transform RDF data retrieved directly from the DBpedia SPARQL endpoint. Here the target instances are restricted to the class foaf:Person; the plus sign after the class restriction means that the rdf:type statements for this class should also be generated. To make the example more interesting, instead of mapping from dbpedia:name to foaf:name, we map from rdfs:label to foaf:name. Since this is only valid for instances of type Person, this property mapping references the class mapping mp:DBpediaToFoafPersonMapping. As a result, mp:labelToNameMapping only picks instances restricted to type foaf:Person, even though it gives no explicit class restriction itself. To show some more transformations, we map to an invented property <http://nodomain/ontology/weight>, which represents the original weight given in grams (a double value) as a string: 'x lb'.

  • R2R mapping file: mappings.ttl
  • Target vocabulary definition:
  • @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    foaf:Person+
    (
    foaf:name,
    <http://nodomain/ontology/weight>
    )
  • Mappings from the mapping file that actually get executed:
    • mp:DBpediaToFoafPersonMapping
    • mp:labelToNameMapping
    • mp:numericToStringMapping
  • Example3 application source code: Example3.java
  • Execution instructions:
    • Change into the example_data directory in the R2R root dir.
    • Under Unix type:
    • ./run de.fuberlin.wiwiss.r2r.examples.Example3
    • Under Windows type:
    • run de.fuberlin.wiwiss.r2r.examples.Example3
  • Output created by running the example application:
    • N-Triple: example3_output.nt
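
The class restriction in this example can be sketched in plain Java: only resources typed as a person receive a foaf:name and a weight string, and the rdf:type statement is also emitted because of the plus sign. This is a minimal sketch that does not use the R2R API. The class Example3Sketch, the source terms dbpedia:Person and dbpedia:weight, the two-decimal formatting and the abbreviated target property name ex:weight are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; triples are modelled as {subject, predicate, object}.
class Example3Sketch {

    static final double GRAMS_PER_POUND = 453.59237;

    // Numeric-to-string transformation: grams (double) to a string of the form 'x lb'.
    // Rounding to two decimals is an assumption, not taken from the mapping file.
    static String gramsToPoundString(double grams) {
        return String.format(Locale.ROOT, "%.2f lb", grams / GRAMS_PER_POUND);
    }

    static List<String[]> translate(List<String[]> source) {
        // Step 1: class mapping. Collect all subjects typed with the (assumed) source class.
        Set<String> persons = new HashSet<>();
        for (String[] t : source) {
            if (t[1].equals("rdf:type") && t[2].equals("dbpedia:Person")) {
                persons.add(t[0]);
            }
        }
        // Step 2: property mappings, applied only to those subjects.
        List<String[]> target = new ArrayList<>();
        for (String[] t : source) {
            if (!persons.contains(t[0])) {
                continue; // the class restriction filters out everything else
            }
            if (t[1].equals("rdf:type") && t[2].equals("dbpedia:Person")) {
                target.add(new String[] {t[0], "rdf:type", "foaf:Person"}); // due to the '+'
            } else if (t[1].equals("rdfs:label")) {
                target.add(new String[] {t[0], "foaf:name", t[2]});
            } else if (t[1].equals("dbpedia:weight")) {
                double grams = Double.parseDouble(t[2]);
                target.add(new String[] {t[0], "ex:weight", gramsToPoundString(grams)});
            }
        }
        return target;
    }
}
```

With this sketch, a resource that has an rdfs:label but no person type produces no output, which is the behaviour the reference to the class mapping is meant to achieve.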

Example 4: LinkedMDB to DBpedia mappings

This example shows that you can import OWL (and also RDFS) mappings. The dbpedia:director property is mapped by an owl:equivalentProperty mapping that is converted to an R2R mapping during the import process. The second mapping reduces the path Actor - Performance - Film to a direct Actor - Film connection expressed as a dbpedia:starring statement. The data in the source file was extracted from the LinkedMDB SPARQL endpoint.

  • R2R mapping file: mappings.ttl
  • Source RDF file: example4_data.nt
  • Target vocabulary definition:
  • @prefix dbpedia: <http://dbpedia.org/ontology/> .
    (
    dbpedia:starring,
    dbpedia:director
    )
  • Mappings from the mapping file that actually get executed:
    • mp:StarringMapping
    • OWL mapping: movie:director to dbpedia:director
  • Example4 application source code: Example4.java
  • Execution instructions:
    • Change into the example_data directory in the R2R root dir.
    • Under Unix type:
    • ./run de.fuberlin.wiwiss.r2r.examples.Example4
    • Under Windows type:
    • run de.fuberlin.wiwiss.r2r.examples.Example4
  • Output created by running the example application:
    • N-Triple: example4_output.nt
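
The path reduction performed by mp:StarringMapping can be illustrated with a small join in plain Java. This is a sketch under stated assumptions rather than R2R code: the class name is hypothetical, and so are the source properties ex:hasPerformance (actor to performance) and ex:inFilm (performance to film), which stand in for the actual LinkedMDB terms used in mappings.ttl.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; triples are modelled as {subject, predicate, object}.
class PathReductionSketch {

    // Reduces Actor - Performance - Film to a direct statement: film dbpedia:starring actor.
    static List<String[]> reduce(List<String[]> source) {
        // Index the second hop: performance -> films.
        Map<String, List<String>> filmsOfPerformance = new HashMap<>();
        for (String[] t : source) {
            if (t[1].equals("ex:inFilm")) {
                filmsOfPerformance.computeIfAbsent(t[0], k -> new ArrayList<>()).add(t[2]);
            }
        }
        // Join the first hop (actor -> performance) with the index.
        List<String[]> target = new ArrayList<>();
        for (String[] t : source) {
            if (!t[1].equals("ex:hasPerformance")) {
                continue;
            }
            List<String> films = filmsOfPerformance.get(t[2]);
            if (films == null) {
                continue; // incomplete path: no statement is generated
            }
            for (String film : films) {
                target.add(new String[] {film, "dbpedia:starring", t[0]});
            }
        }
        return target;
    }
}
```

The intermediate Performance resource does not appear in the output; only the direct connection between film and actor remains.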

Example 5: Mapping Chains

This example shows how the R2R API builds mapping chains in order to compensate for missing mappings. The mapping file consists of mappings between DBpedia and other datasets. For the dbpedia:runtime property, a mapping chain from LinkedMDB via Freebase to DBpedia is built. Besides the target vocabulary data itself, the example program also outputs the mapping chain and information about the ranking score produced by R2R's mapping quality assessment heuristic. The data in the source file was extracted from the LinkedMDB SPARQL endpoint.

  • R2R mapping file: DBpediaToX.ttl
  • Source RDF file: discoveryExample1_input.n3
  • Target vocabulary definition (Discovery):
  • @prefix dbpedia: <http://dbpedia.org/ontology/> .
    @prefix linkedmdb: <http://data.linkedmdb.org/resource/movie/> .

    (dbpedia:runtime, dbpedia:Film, dbpedia:director)
  • Mappings from the mapping file that actually get executed:
    • mappings:linkedmdbToDBpediaFilm
    • mappings:linkedmdbToDBpediaDirectorProperty
    • mappings:freebaseToDBpediaRuntime
    • mappings:linkedmdbToFreebaseRuntime
  • Example application source code: DiscoveryExample1.java
  • Execution instructions:
    • Change into the example_data directory in the R2R root dir.
    • Under Unix type:
    • ./run de.fuberlin.wiwiss.r2r.examples.DiscoveryExample1
    • Under Windows type:
    • run de.fuberlin.wiwiss.r2r.examples.DiscoveryExample1
  • Output created by running the example application:
    • N-Triple: discoveryExample1_output.nt
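
The idea of chaining partial mappings can be sketched as a search over mappings, each leading from one term to another. The sketch below uses a plain breadth-first search that returns the shortest chain; it does not reproduce R2R's mapping quality assessment heuristic and does not call the R2R API. The class name and the term names linkedmdb:runtime and freebase:runtime are hypothetical placeholders.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: breadth-first search instead of R2R's ranking heuristic.
class MappingChainSketch {

    // mappings: mapping name -> {source term, target term}.
    // Returns the mapping names in execution order, or an empty list if no chain exists.
    static List<String> findChain(Map<String, String[]> mappings, String from, String to) {
        Map<String, String> producedBy = new HashMap<>(); // term -> mapping that produces it
        Map<String, String> previous = new HashMap<>();   // term -> term it was derived from
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(from);
        queue.add(from);
        while (!queue.isEmpty()) {
            String term = queue.remove();
            if (term.equals(to)) {
                // Walk back from the target term to the start term.
                LinkedList<String> chain = new LinkedList<>();
                for (String t = term; producedBy.containsKey(t); t = previous.get(t)) {
                    chain.addFirst(producedBy.get(t));
                }
                return chain;
            }
            for (Map.Entry<String, String[]> m : mappings.entrySet()) {
                String source = m.getValue()[0];
                String target = m.getValue()[1];
                if (source.equals(term) && visited.add(target)) {
                    producedBy.put(target, m.getKey());
                    previous.put(target, term);
                    queue.add(target);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

Given one mapping from linkedmdb:runtime to freebase:runtime and another from freebase:runtime to dbpedia:runtime, the search returns both mappings in that order, which corresponds to the chain from LinkedMDB via Freebase to DBpedia described in this example.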

4. Configuration options

Logging

The R2R API uses the Apache commons-logging API. An example configuration is provided in the form of a log4j.properties file. It defines a File Appender that writes logging messages to the file r2r.log. Erroneous mappings or input data do not cause exceptions that are propagated outside the mapping API. Instead, this information is logged at the DEBUG level and thus written to the r2r.log file. If something does not work as expected, this file is a good place to start looking. If you want these messages displayed on the console, make the following change to the log4j.properties file:

Change the following line

log4j.appender.stdout.Threshold=INFO

to

log4j.appender.stdout.Threshold=DEBUG

And don't forget to put the log4j.properties file into the classpath.

Exception Handling

By default the R2R framework throws an exception when it encounters an error while creating mappings, executing mappings or loading external functions. If you only want to log these errors instead, set the r2r.ExceptionHandling.rethrow property in the r2r.properties file to false, as in the last line of the following example, and put the r2r.properties file into the classpath.

r2r.FunctionManager=de.fuberlin.wiwiss.r2r.LoadingFunctionManager
r2r.FunctionManager.loadFromURLs=false
r2r.ExceptionHandling.rethrow=false

This will make the R2R API log the error and continue with its work.
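
If the setting seems to have no effect, it can help to check that r2r.properties is actually found on the classpath. The following sketch uses only java.util.Properties from the Java standard library. It does not show how R2R itself reads the file; the class R2rPropertiesCheck and its methods are hypothetical, and the default of true mirrors the default behaviour described above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for checking the configuration; not part of the R2R API.
class R2rPropertiesCheck {

    // Parses properties from text, which keeps the helper easy to try out.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumes a missing key means "rethrow", matching the default described above.
    static boolean rethrowEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty("r2r.ExceptionHandling.rethrow", "true"));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = R2rPropertiesCheck.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream("r2r.properties")) {
            if (in == null) {
                System.out.println("r2r.properties was not found on the classpath");
                return;
            }
            Properties props = new Properties();
            props.load(in);
            System.out.println("rethrow exceptions: " + rethrowEnabled(props));
        }
    }
}
```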

5. R2R Graphical User Interface

The R2R Graphical User Interface is a web application that provides a graphical user interface to the R2R Framework. It allows the user to execute vocabulary mappings on data sources that are either located in a triple store or in RDF dumps. The main features of the R2R Graphical User Interface are:

  • The user may select input data from a SPARQL endpoint as well as from RDF dumps.
  • Mappings can be loaded from files and edited in a text-based editor.
  • On executing mappings, the translated RDF data can be written to SPARQL/Update endpoints or written to RDF dumps.


Requirements

  • JDK 6 or later
  • Play Framework 2.0 or later
  • The latest R2R source code

Running the R2R Graphical User Interface

  • Navigate to the r2rgui directory.
  • Execute: play run
  • Navigate in your browser to http://localhost:9000

6. Source code and development


The R2R API is hosted by SourceForge.net as part of the R2R project. The latest source code is available from the project's SVN repository and can be browsed online.
It can be used under the terms of the Apache Software License, Version 2.0.

7. References

  1. Bizer, C., Schultz, A.: The R2R Framework: Publishing and Discovering Mappings on the Web. 1st International Workshop on Consuming Linked Data (COLD 2010), Shanghai, November 2010.
  2. Bizer, C., Heath, T., Berners-Lee, T.: Linked Data - The Story So Far. International Journal on Semantic Web & Information Systems, Vol. 5, Issue 3, pp. 1-22 (2009)
  3. Franklin, M.J., Halevy, A.Y., Maier, D.: From databases to dataspaces: A new abstraction for information management. SIGMOD Record 34(4), pp. 27-33 (2005)
  4. Vaz Salles, M.A., Dittrich, J., Karakashian, S.K., Girard, O.R., Blunschi, L.: iTrails: Pay-as-you-go Information Integration in Dataspaces. In: Conference on Very Large Data Bases (VLDB 2007), pp. 663-674 (2007)
  5. Haslhofer, B.: A Web-based Mapping Technique for Establishing Metadata Interoperability. PhD thesis, Universität Wien (2008)
  6. Euzenat, J., Scharffe, F., Zimmermann, A.: Expressive alignment language and implementation. Knowledge Web project report, KWEB/2004/D2.2.10/1.0 (2007)
  7. Alignment API: API and implementation for expressing and sharing ontology alignments