Chris Bizer
Tobias Gauß

The RDF book mashup demonstrates how Web 2.0 data sources like Amazon,
Google or Yahoo can be integrated into the Semantic Web.

The RDF book mashup makes information about books, their authors,
reviews, and online bookstores available on the Semantic Web. This
information can be used by RDF tools and you can link to it from your
own Semantic Web data.

Contents

  1. Introduction
  2. Some Examples
  3. Architecture
  4. Some Use Cases
  5. Book Search
  6. Feedback

1. Introduction

The vision of the Semantic Web is to build a global information space consisting of linked data. Major Web data sources like Google, Yahoo, Amazon and eBay have started to make their data accessible through proprietary APIs. This has inspired the development of many interesting mashups that combine data from a fixed number of sources. See ProgrammableWeb for an overview of the different APIs and mashups.

The goal of the RDF book mashup is to show how Web 2.0 data sources can be integrated into the Semantic Web. Once integrated, Web 2.0 data can be browsed using generic RDF browsers like Tabulator, and it can be crawled and cached by Semantic Web search engines like SWSE and Swoogle or by the Semantic Web Client Library. This will eventually make it possible to query the complete Web using the SPARQL query language.

The basic requirements that data has to fulfill in order to be part of the Semantic Web are:

  1. All entities of interest, such as information resources, real-world objects, and vocabulary terms, should be identified by URI references.
  2. URI references should be dereference-able, meaning that an application can look up a URI over HTTP and retrieve RDF data about the identified resource.
  3. Data should be provided using the RDF/XML syntax.
  4. Data should be interlinked with other data. Thus resource descriptions should contain links to related information in the form of dereference-able URIs within RDF statements or as rdfs:seeAlso links.

The book mashup applies these principles to Web 2.0 data about books. The mashup assigns dereference-able URIs to books, authors, reviews, online bookstores and purchase offers. Whenever a URI is dereferenced, the mashup queries the Amazon API for information about the book and the Google Base API for purchase offers from different bookstores that sell the book. The query results are returned as an RDF description to the client.
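
From the client side, dereferencing a book URI amounts to an HTTP GET that asks for RDF. The following is a minimal sketch in Python, not part of the mashup itself. The helper name build_rdf_request is hypothetical, and the request is only prepared here, not sent.

```python
from urllib.request import Request

# Book URI taken from the FOAF example in section 4.2.
BOOK_URI = "http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X"


def build_rdf_request(uri: str) -> Request:
    """Prepare (but do not send) an HTTP GET that asks for RDF/XML.

    Hypothetical helper for illustration only.
    """
    # application/rdf+xml is the registered media type for RDF/XML.
    return Request(uri, headers={"Accept": "application/rdf+xml"})


request = build_rdf_request(BOOK_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the actual lookup and follow any redirect the server answers with.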

2. Some Examples


RDF description for Tim Berners-Lee's Weaving the Web book:

Example reviews for Tim's book:

Example purchase offer for Tim's book:

Information about Tim as an author:

URIs of some more books:

Books having the subject "Computer - Internet":

3. Architecture

The RDF book mashup is implemented as a small PHP script (300 lines of code). Whenever the script receives a lookup call for a book URI, it decodes the ISBN of the book from the URI and uses it to query the Amazon API and the Google Base API for information about the book. The resulting XML responses are turned into an RDF model, which is serialized to RDF/XML using RAP - RDF API for PHP.
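
The lookup flow can be illustrated with a short Python sketch. This is not the PHP implementation and does not use RAP. It assumes that the ISBN is the last path segment of the book URI, it skips the API calls entirely, and it uses dc:title as an assumed property. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def isbn_from_uri(uri: str) -> str:
    """Return the last path segment of a book URI (assumed to be the ISBN)."""
    return urlparse(uri).path.rstrip("/").split("/")[-1]


def book_to_rdfxml(book_uri: str, title: str, creator_uri: str) -> str:
    """Serialize a tiny book description as RDF/XML.

    The arguments stand in for values that the real script would take
    from the Amazon and Google Base responses.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": book_uri}
    )
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    ET.SubElement(desc, f"{{{DC_NS}}}creator", {f"{{{RDF_NS}}}resource": creator_uri})
    return ET.tostring(root, encoding="unicode")


book_uri = "http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X"
isbn = isbn_from_uri(book_uri)
rdf_xml = book_to_rdfxml(
    book_uri,
    "Weaving the Web",
    "http://www4.wiwiss.fu-berlin.de/bookmashup/persons/Tim+Berners-Lee",
)
print(isbn)
print(rdf_xml)
```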

The mashup supports 303 redirects. So when you enter a URI identifying a book, your browser gets redirected to an RDF document describing the book. The source code of the RDF book mashup is available from the RAP SVN.
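
The redirect step can be sketched as a function that maps the URI of a book to the URI of a document describing it. The /doc/ path segment used below is purely hypothetical, as the actual layout of the document URIs is not described here. The function name is an assumption as well.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def redirect_for(resource_uri: str) -> Tuple[int, Dict[str, str]]:
    """Return the status code and headers of a 303 response.

    The "/bookmashup/doc/" prefix is a made-up convention for this sketch.
    """
    document_uri = resource_uri.replace("/bookmashup/", "/bookmashup/doc/", 1)
    return int(HTTPStatus.SEE_OTHER), {"Location": document_uri}


status, headers = redirect_for(
    "http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X"
)
print(status, headers["Location"])
```

Answering with 303 See Other rather than 200 keeps the URI that identifies the book distinct from the URI of the RDF document that describes it.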

4. Some Use Cases

This section lists some potential use cases for the RDF book mashup.

4.1 Linking to RDF book descriptions from an HTML page

The book URIs can be used to link RDF descriptions to HTML pages that list books. For instance, we have added RDF links to the W3C list of Semantic Web books. If you include book links with a <link rel="alternate"> tag in the head of your HTML page, surfers can use tools like Piggybank to collect book descriptions from the Web. See Using RDF/XML within HTML for details on how to set links.
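
As an illustration, the following Python sketch generates such a link element for a page that lists Weaving the Web. The helper name rdf_link_tag and the use of a title attribute are assumptions made for this example.

```python
from html import escape


def rdf_link_tag(rdf_uri: str, title: str) -> str:
    """Return a <link> element for the <head> of an HTML page.

    Hypothetical helper; attribute values are HTML-escaped.
    """
    return (
        '<link rel="alternate" type="application/rdf+xml" '
        f'title="{escape(title, quote=True)}" '
        f'href="{escape(rdf_uri, quote=True)}" />'
    )


tag = rdf_link_tag(
    "http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X",
    "Weaving the Web",
)
print(tag)
```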

4.2 Linking to RDF book descriptions from your FOAF profile

If you are the author of a book, you can add a link from your FOAF profile to the description of your book. This allows surfers who use generic RDF browsers like Tabulator to navigate from your FOAF profile to the description of your book and from there to a bookstore that sells your book.

The example below shows an RDF link from Tim's FOAF profile pointing at his Weaving the Web book:

<foaf:Document rdf:about="http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X">
  <dc:creator rdf:resource="http://www.w3.org/People/Berners-Lee/card#i" />
</foaf:Document>

A second way to set a link is to state in your FOAF profile that you are the same person as one of the author resources created by the book mashup. This allows a surfer to navigate from your profile to all your books. For instance:

<foaf:Person rdf:about="http://www.w3.org/People/Berners-Lee/card#i">
  <owl:sameAs rdf:resource="http://www4.wiwiss.fu-berlin.de/bookmashup/persons/Tim+Berners-Lee" />
</foaf:Person>

Example FOAF profiles that link to the book mashup are:

4.3 Linking book descriptions to other public data sources

A central strength of the Semantic Web is that it allows you to set links between information about the same object within multiple data sources. A second publicly available bibliographic data source is the DBLP database containing journal articles and conference papers. The DBLP database is published as linked data by a D2R Server at http://www4.wiwiss.fu-berlin.de/dblp/. The book mashup automatically generates owl:sameAs links between book authors and paper authors in the DBLP database. Using Tabulator, these links allow you to navigate from the description of a book author to his papers in the DBLP database. If an RDF crawler caches information from both sources, the owl:sameAs links can be used to conclude that two URIs refer to the same resource and thus allow SPARQL queries over merged data from both sources.

The links are generated by querying the SPARQL endpoint of the DBLP database for URIs identifying book authors. If the query for a foaf:Person with a specific name returns exactly one result, we assume, since both domains are related, that we have very likely found the right person, and we set the owl:sameAs link.
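
The heuristic can be sketched in Python as follows. The exact query sent to the DBLP endpoint is not shown on this page, so the SPARQL text below is an assumption, and both function names are hypothetical. Sending the query to the endpoint is left out.

```python
from typing import List, Optional


def author_lookup_query(name: str) -> str:
    """Build a SPARQL query for persons with exactly this foaf:name.

    The query shape is an assumption, not the mashup's actual query.
    """
    # Escape backslashes and double quotes for the SPARQL string literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT ?person WHERE { ?person a foaf:Person ; "
        f'foaf:name "{escaped}" . }}'
    )


def same_as_target(candidate_uris: List[str]) -> Optional[str]:
    """Return the URI to link to, or None if the match is ambiguous or empty."""
    return candidate_uris[0] if len(candidate_uris) == 1 else None


print(author_lookup_query("Tim Berners-Lee"))

# A single hit is treated as a match; zero or several hits produce no link.
hits = ["http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007"]
print(same_as_target(hits))
```

Requiring exactly one hit trades recall for precision: authors with common names get no link, which is safer than asserting owl:sameAs between two different people.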

An example of such an auto-generated owl:sameAs link is found in the data about Tim:
http://www4.wiwiss.fu-berlin.de/bookmashup/persons/Tim+Berners-Lee

<foaf:Person rdf:about="http://www4.wiwiss.fu-berlin.de/bookmashup/persons/Tim+Berners-Lee">
<owl:sameAs rdf:resource="http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007"/>
<foaf:name>Tim Berners-Lee</foaf:name>
</foaf:Person>

5. Book Search

The form below allows you to find the URI that identifies a particular book.

Enter search terms like a part of the author's name or the book title:

6. Feedback

For our last Semantic Web demo, a D2R server publishing the DBLP database as linked data, we could still estimate the number of new RDF documents that were added to the Semantic Web (1.2 million documents with around 15 million triples in total). For the book mashup, we no longer have any idea how many triples we are publishing. Maybe something like a billion?

We are very interested in hearing your opinion about the mashup. Please send comments to Chris Bizer and Tobias Gauß and cc the Semantic Web mailing list if your comment is of general interest.

Several people have pointed us to other bibliographic data sources that might be used for further mashups:

Ivan Herman has written a script that collects all Book Mashup data from the W3C Semantic Web book list and puts it into a single file, in order to make it easier to run SPARQL queries against the data. See his blog post about it. The file is found here.

Further information about our work in the area of the Semantic Web/Web-of-Data is found at
List of our other open source projects @ Freie Universität Berlin
