RDF

Resource Description Framework: Information Representation in the Semantic Web

Tobias Pfeiffer, TESOBE
tobias@tesobe.com

Information

We make apps that deal with information.

Modelling Information

Entity Relationship Model

from http://wofford-ecs.org/dataandvisualization/ermodel/material.htm

Modelling Information
as Relational Data

Entities with Attributes

person
id first_name last_name
27 John Doe ← an entity of type “person”
28

Relations

many2one

person
id first_name last_name company_id
27 John Doe 3
28
company
id name
3 ACME Inc.

many2many

person
id first_name last_name
27 John Doe
28
employment
person_id company_id
27 3
company
id name
3 ACME Inc.
4
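
The many2many layout can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the second company and the second employment row are hypothetical additions, made up to show one person linked to several companies.

```python
import sqlite3

def build_db():
    """Create the person/employment/company tables in an in-memory database."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE person  (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
        -- link table: one row per (person, company) pair
        CREATE TABLE employment (
            person_id  INTEGER REFERENCES person(id),
            company_id INTEGER REFERENCES company(id),
            PRIMARY KEY (person_id, company_id)
        );
        INSERT INTO person  VALUES (27, 'John', 'Doe');
        INSERT INTO company VALUES (3, 'ACME Inc.');
        INSERT INTO company VALUES (5, 'Example Ltd.');  -- hypothetical extra row
        INSERT INTO employment VALUES (27, 3);
        INSERT INTO employment VALUES (27, 5);           -- hypothetical extra row
    """)
    return con

def employers_of(con, person_id):
    """Return the names of all companies a person is linked to, sorted by name."""
    rows = con.execute(
        "SELECT company.name FROM employment "
        "JOIN company ON company.id = employment.company_id "
        "WHERE employment.person_id = ? ORDER BY company.name",
        (person_id,),
    ).fetchall()
    return [name for (name,) in rows]

if __name__ == "__main__":
    print(employers_of(build_db(), 27))
```

Neither the person table nor the company table carries a foreign key here; the relation lives entirely in the employment table.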

one2many

person
id first_name last_name
27 John Doe
28
company
id name person_id
3 ACME Inc. 27
4

Modelling Information
in RDF

The Resource Description Framework is a different way to express information.

Information as
Triple Data

Statements are Subject–Predicate–Object triples.

<http://people.org/John_Doe> <http://xmlns.com/foaf/0.1/knows> <http://people.org/Jane_Doe>

<http://people.org/John_Doe> <http://xmlns.com/foaf/0.1/img> <http://john.com/john.jpg>

<http://people.org/John_Doe> <http://xmlns.com/foaf/0.1/givenName> "John"

<http://people.org/John_Doe> <http://xmlns.com/foaf/0.1/age> "28"^^xsd:integer

<http://people.org/John_Doe> <http://xmlns.com/foaf/0.1/name> "ジョン ドウ"@ja
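
To make the triple idea concrete, here is a small Python sketch that keeps statements as (subject, predicate, object) tuples. The Literal class and the match function are illustrative names chosen for this sketch and do not come from any RDF library.

```python
from typing import NamedTuple, Optional

FOAF = "http://xmlns.com/foaf/0.1/"
PPL = "http://people.org/"

class Literal(NamedTuple):
    """A literal value with an optional datatype or language tag."""
    value: str
    datatype: Optional[str] = None
    lang: Optional[str] = None

# Each statement is a (subject, predicate, object) tuple.
# IRIs are plain strings; literals are Literal instances.
TRIPLES = {
    (PPL + "John_Doe", FOAF + "knows", PPL + "Jane_Doe"),
    (PPL + "John_Doe", FOAF + "givenName", Literal("John")),
    (PPL + "John_Doe", FOAF + "age", Literal("28", datatype="xsd:integer")),
    (PPL + "John_Doe", FOAF + "name", Literal("ジョン ドウ", lang="ja")),
}

def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]
```

Calling match(TRIPLES, p=FOAF + "givenName") returns the single statement that gives John's first name.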

Different Notations I

N-Triples

<http://people.org/John_Doe> <http://xmlns.com/foaf/0.1/knows> <http://people.org/Jane_Doe>.

Turtle

@prefix ppl: <http://people.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ppl:John_Doe foaf:givenName "John" ;
             foaf:familyName "Doe" .

ppl:John_Doe foaf:knows ppl:Jane_Doe ,
                        <http://facebook.com/richard.miles> .
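
The relation between the two notations can be sketched in a few lines of Python: a Turtle prefixed name is shorthand for the full IRI that N-Triples writes out. This is not a Turtle parser, and the function names expand and to_ntriples are made up for this illustration.

```python
PREFIXES = {
    "ppl": "http://people.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

def expand(term, prefixes=PREFIXES):
    """Turn a prefixed name like 'foaf:knows' into a full IRI in angle brackets.

    Terms that are already full IRIs (<...>) or literals ("...") pass through.
    """
    if term.startswith("<") or term.startswith('"'):
        return term
    prefix, _, local = term.partition(":")
    return "<" + prefixes[prefix] + local + ">"

def to_ntriples(subject, predicate, obj):
    """Serialize one statement as an N-Triples line."""
    return " ".join(expand(t) for t in (subject, predicate, obj)) + " ."

if __name__ == "__main__":
    print(to_ntriples("ppl:John_Doe", "foaf:knows", "ppl:Jane_Doe"))
```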

Different Notations II

RDF/XML

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://people.org/John_Doe">
    <foaf:givenName>John</foaf:givenName>
    <foaf:knows rdf:resource="http://people.org/Jane_Doe" />
  </rdf:Description>
</rdf:RDF>

RDFa

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    version="XHTML+RDFa 1.0" xml:lang="en">
  <head>
    <title>John's Home Page</title>
  </head>
  <body about="http://people.org/John_Doe">
    <h1>John's Home Page</h1>
    <p>My name is <span property="foaf:givenName">John</span> and I happen
      to know <a href="http://people.org/Jane_Doe" rel="foaf:knows">Jane</a>.
    </p>
  </body>
</html>

Information as a
Directed Graph
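
One way to picture the graph view: every triple is an edge from the subject node to the object node, labelled with the predicate. The Python sketch below builds such an adjacency structure; the to_graph helper is a made-up name, and the IRIs are abbreviated with prefixes for readability.

```python
from collections import defaultdict

# Statements from the earlier slides, with prefixes abbreviated for readability.
TRIPLES = [
    ("ppl:John_Doe", "foaf:knows", "ppl:Jane_Doe"),
    ("ppl:John_Doe", "foaf:givenName", '"John"'),
    ("ppl:John_Doe", "foaf:familyName", '"Doe"'),
]

def to_graph(triples):
    """Map each subject node to its outgoing (edge label, target node) pairs."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)

if __name__ == "__main__":
    for label, target in to_graph(TRIPLES)["ppl:John_Doe"]:
        print("ppl:John_Doe --" + label + "--> " + target)
```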

Querying for Information

Querying for Information
in RDBs

Query tables.

Hello from SQL

SELECT first_name
FROM person
WHERE last_name = 'Doe';

Result:

first_name (varchar)
John

Asking about relations

SELECT first_name, last_name, employer_id
FROM person
JOIN company ON (person.employer_id = company.id)
WHERE company.name = 'ACME Corp.';

Result:

first_name (varchar)  last_name (varchar)  employer_id (int)
John                  Doe                  3

Querying for Information
in RDF

Query graphs, not tables.

Hello from SPARQL

“All resources with last name ‘Doe’”

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?first_name
WHERE {
  ?person foaf:givenName ?first_name .
  ?person foaf:familyName "Doe" .
}

Result:

person first_name
<http://people.org/John_Doe> "John"
<http://people.org/Jane_Doe> "Jane"
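
What a SPARQL engine does with the two triple patterns can be approximated in plain Python: each pattern is matched against every triple, and variable bindings are carried from one pattern to the next. This is a naive sketch for intuition only, not how a real triple store evaluates queries; the Richard_Miles IRI and all function names are assumptions made for the example.

```python
def is_var(term):
    """Variables are written SPARQL-style with a leading question mark."""
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`; None on failure."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if result.setdefault(p_term, t_term) != t_term:
                return None  # variable already bound to something else
        elif p_term != t_term:
            return None      # constant does not match
    return result

def query(triples, patterns):
    """Evaluate a list of triple patterns as a conjunction (a basic graph pattern)."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings

TRIPLES = [
    ("ppl:John_Doe", "foaf:givenName", '"John"'),
    ("ppl:John_Doe", "foaf:familyName", '"Doe"'),
    ("ppl:Jane_Doe", "foaf:givenName", '"Jane"'),
    ("ppl:Jane_Doe", "foaf:familyName", '"Doe"'),
    ("ppl:Richard_Miles", "foaf:givenName", '"Richard"'),  # hypothetical IRI
    ("ppl:Richard_Miles", "foaf:familyName", '"Miles"'),
]

PATTERNS = [
    ("?person", "foaf:givenName", "?first_name"),
    ("?person", "foaf:familyName", '"Doe"'),
]

if __name__ == "__main__":
    for row in query(TRIPLES, PATTERNS):
        print(row["?person"], row["?first_name"])
```

Because ?person appears in both patterns, the second pattern only keeps bindings whose person also has the family name "Doe".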

Asking About Relations

“All resources that are members of ‘ACME Corp.’”

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?first_name ?last_name ?company
WHERE {
  ?person foaf:givenName ?first_name .
  ?person foaf:familyName ?last_name .
  ?company foaf:member ?person .
  ?company foaf:name "ACME Corp." .
}

Result:

first_name last_name company
"John" "Doe" <http://www.example.com/acme>
"Richard" "Miles" <http://www.example.com/acme>

Advanced Topics

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?first_name ?last_name ?gender
WHERE {
  ?person foaf:givenName ?first_name .
  ?person foaf:familyName ?last_name .
  OPTIONAL { ?person foaf:gender ?gender } .
  ?person foaf:knows/foaf:knows? ?employee .
  ?employee foaf:age ?age .
  ?company foaf:member ?employee .
  ?company foaf:name "ACME Corp." .
  FILTER (?age < 30)
}

Result:

first_name last_name gender
"Jane" "Doe" "female"
"Robby" "Robot"

 

(plus: GROUP BY, HAVING, ORDER BY, subqueries, regex matching, …)
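
The foaf:knows step in the query above asks for people who are one or two "knows" hops away from an employee. As a rough Python sketch of that part alone, with hypothetical edges and a made-up helper name:

```python
def knows_within_two(knows, start):
    """Return everyone reachable from `start` via one or two foaf:knows edges."""
    one_hop = set(knows.get(start, ()))
    two_hops = {friend for person in one_hop for friend in knows.get(person, ())}
    return one_hop | two_hops

# Hypothetical foaf:knows edges, keyed by subject.
KNOWS = {
    "Jane": ["John"],
    "John": ["Richard"],
    "Richard": ["Robby"],
}

if __name__ == "__main__":
    print(sorted(knows_within_two(KNOWS, "Jane")))
```

Jane reaches John directly and Richard through John, but not Robby, who is three hops away.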

Storing Information

Storing Relational Data

Use a relational database ;-)

Storing RDF Data

Use a triple store.

RDB as a Backend

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?first_name ?last_name ?company
WHERE {
  ?person foaf:givenName ?first_name .
  ?person foaf:familyName ?last_name .
  ?company foaf:member ?person .
  ?company foaf:name "ACME Corp." .
}

could map to

SELECT t1.obj, t2.obj, t3.sub
FROM data t1
JOIN data t2 ON (t1.sub = t2.sub)
JOIN data t3 ON (t3.obj = t1.sub)
JOIN data t4 ON (t4.sub = t3.sub)
WHERE t1.pred = 'foaf:givenName'
AND t2.pred = 'foaf:familyName'
AND t3.pred = 'foaf:member'
AND t4.pred = 'foaf:name'
AND t4.obj = 'ACME Corp.'

or to

SELECT givenName.obj, familyName.obj, member.sub
FROM givenName
JOIN familyName ON (givenName.sub = familyName.sub)
JOIN member ON (member.obj = givenName.sub)
JOIN name ON (name.sub = member.sub)
WHERE name.obj = 'ACME Corp.'
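
The single-table mapping can be tried with sqlite3. The sketch below assumes a table data(sub, pred, obj) that stores prefixed names as plain text; the sample rows and the ex:acme identifier are made up. It selects the company's identifier (t3.sub), since ?company in the SPARQL query is the subject of foaf:member.

```python
import sqlite3

QUERY = """
    SELECT t1.obj, t2.obj, t3.sub
    FROM data t1
    JOIN data t2 ON (t1.sub = t2.sub)
    JOIN data t3 ON (t3.obj = t1.sub)
    JOIN data t4 ON (t4.sub = t3.sub)
    WHERE t1.pred = 'foaf:givenName'
      AND t2.pred = 'foaf:familyName'
      AND t3.pred = 'foaf:member'
      AND t4.pred = 'foaf:name'
      AND t4.obj = 'ACME Corp.'
    ORDER BY t1.obj
"""

def members_of_acme():
    """Load a few sample triples into one table and run the self-join query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE data (sub TEXT, pred TEXT, obj TEXT)")
    con.executemany("INSERT INTO data VALUES (?, ?, ?)", [
        ("ppl:John_Doe", "foaf:givenName", "John"),
        ("ppl:John_Doe", "foaf:familyName", "Doe"),
        ("ppl:Jane_Doe", "foaf:givenName", "Jane"),
        ("ppl:Jane_Doe", "foaf:familyName", "Doe"),
        ("ex:acme", "foaf:name", "ACME Corp."),       # hypothetical company IRI
        ("ex:acme", "foaf:member", "ppl:John_Doe"),
    ])
    return con.execute(QUERY).fetchall()

if __name__ == "__main__":
    print(members_of_acme())
```

Every triple pattern in the SPARQL query becomes one more self-join on the same table, which is why larger queries get expensive with this layout.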

Selected Triple Stores

Name               Language  Website
Apache Jena        Java      jena.apache.org
   → implements a lot of SPARQL features
4store             C         www.4store.org
   → fast, small, robust
OpenLink Virtuoso  C         virtuoso.openlinksw.com
   → powers DBpedia, probably scales well ;-)
Sesame             Java      www.openrdf.org

 

  • Could be used directly in application
  • or with SPARQL via HTTP
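
For the HTTP route, a client sends the query text as the query parameter of a GET request and can ask for results in the SPARQL JSON format. The sketch below uses only Python's standard library; the endpoint URL in the usage part is a placeholder, and run_query needs network access to a real endpoint.

```python
import json
import urllib.parse
import urllib.request

def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )

def parse_results(body):
    """Flatten a SPARQL JSON result document into a list of {var: value} dicts."""
    doc = json.loads(body)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]

def run_query(endpoint, query):
    """Send the query and return the parsed rows (needs network access)."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return parse_results(response.read().decode("utf-8"))

if __name__ == "__main__":
    # Placeholder endpoint; replace with the URL of an actual triple store.
    request = build_request("http://example.org/sparql", "SELECT * WHERE { ?s ?p ?o } LIMIT 1")
    print(request.full_url)
```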

Use Shared Information

How can we consume and extend other people's data?

Using Shared
Relational Data

Data Sources

Issues

  • Different names and meanings:
    fname  lname
    John   Doe

    givenName  familyName  compId
    John       Doe         4

    name      company
    John Doe  ACME Corp.
  • Encodings
  • Naming conflicts
  • Duplicate information
  • Merging issues
  • Missing documentation

Using Shared RDF Data

Data Sources

  • DBpedia: an RDF version of Wikipedia
    SELECT ?personName ?langName WHERE {
      ?person dbpedia-owl:knownFor ?language .
      ?language rdf:type dbpedia-owl:ProgrammingLanguage .
      ?language dbpprop:year ?year .
      ?person rdfs:label ?personName .
      ?language rdfs:label ?langName .
      FILTER(LANG(?personName) = 'en' && LANG(?langName) = 'en')
    } ORDER BY DESC(?year) LIMIT 3
    personName                  langName
    "Donald D. Chamberlin"@en   "XQuery"@en
    "Patrick Collison"@en       "Croma"@en
    "Martin Odersky"@en         "Scala (programming language)"@en
  • Linked Open Data

Import vs. Online Access

Often RDF dumps and SPARQL endpoints are provided.

Import

  • + offline access (import into its own “graph”)
  • + always available
  • – large mass of data
  • – admin work necessary

Online Access

  • + add new data sources quickly
  • + fewer server resources necessary
  • – SPARQL endpoint may be down/slow
  • – data may change/disappear

Federated Queries

  • Automatic federation is a current research topic, cf. DARQ (dead), SQUIN, FedX.
  • Manual federation via SERVICE keyword.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE
{
  <http://people.org/John_Doe> foaf:knows ?person .
  SERVICE <http://people.example.org/sparql> {
    ?person foaf:name ?name .
  }
}
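
Conceptually, the SERVICE block is answered by the remote endpoint, and the engine joins those answers with the locally matched ?person values. The Python sketch below imitates that join with an in-memory dictionary standing in for the remote endpoint. All data and names are hypothetical, and a real engine would send the inner pattern over HTTP instead of doing a dictionary lookup.

```python
# Local data: who John knows (IRIs abbreviated with the ppl: prefix).
LOCAL_TRIPLES = [
    ("ppl:John_Doe", "foaf:knows", "ppl:Jane_Doe"),
    ("ppl:John_Doe", "foaf:givenName", '"John"'),
]

# Hypothetical stand-in for the data behind the remote SPARQL endpoint.
REMOTE_NAMES = {"ppl:Jane_Doe": ["Jane Doe"]}

def remote_lookup(person):
    """Pretend to ask the remote endpoint for all foaf:name values of a person."""
    return REMOTE_NAMES.get(person, [])

def federate(local_triples, lookup, subject):
    """Join local foaf:knows matches with names returned by the remote lookup."""
    names = []
    for s, p, o in local_triples:
        if s == subject and p == "foaf:knows":
            names.extend(lookup(o))
    return names

if __name__ == "__main__":
    print(federate(LOCAL_TRIPLES, remote_lookup, "ppl:John_Doe"))
```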

Federated Queries:
Architecture Proposal

Architecture for federated SPARQL queries

Conclusion

  • Use RDBs if your data is tabular.
  • Consider RDF for graph-like data.
  • Consider RDF if you want to use 3rd-party data.
  • Choose your triple store wisely.

 

Things I didn’t talk about

  • Inference
  • Popular vocabularies
  • Attributes for relations

Thank You!