Introduction

Triple is an RDF query, inference, and transformation language for the semantic web. I use the term "Triple" in this tutorial for both the language and the inferencing engine that lets you use the language. The engine is integrated into the semantic web context by providing import facilities for RDF/XML and RDF/N3. A set of Triple rules implements a subset of OWL, specifically OWL Lite-.

In contrast to procedural programming languages such as C or Java, Triple is a declarative language which shares some similarities with SQL or Prolog. Triple lets you work with programs that consist of facts and rules from which Triple can draw conclusions for answering queries.

This tutorial tries to explain (informally) three things: first, how to use Triple; second, how to make sense of Logic; and third, how Triple works. In addition, I outline some possible efforts to extend Triple in the conclusion.

This tutorial should introduce you to the concepts needed to apply Triple to tasks on the semantic web. If I failed to make everything easily understandable, please let me know. If you don't like one or more sections, I encourage you to contribute better explanations. Comments and suggestions are very welcome. Also, note that the current version is still a draft.

How to use Triple

In the following, we will cover things you need to know if you want to use Triple (on the semantic web). We assume you are familiar with RDF. If not, read the RDF Primer Primer for a three-page introduction to RDF/N3.

Hello World

In the tradition of introductory texts for programming languages, let's start with a very simple "hello world" example. Because Triple is a declarative language rather than a procedural one, we first add a fact to its knowledge base and then pose a query to retrieve parts of the fact. A fact is an RDF statement.

subject[predicate->"Hello World"]@hellomodel.
Hello World in Triple

Now, you can add the fact to Triple's knowledge base. You can cut-and-paste the above line into the online Triple server, and hit the "Add" button.

In order to get parts of the fact out of Triple's knowledge base, you have to pose a query.

FORALL X, Y, Z <- X[Y->Z]@hellomodel.
Triple query

If you cut and paste the above query into Triple's server page, the result is an RDF/XML file that contains the statement subject[predicate->"Hello World"].
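The add-then-query cycle can be mimicked outside Triple in a few lines of Python. This is only a sketch and not Triple's implementation: the KnowledgeBase class and its add and query methods are hypothetical names chosen for illustration, and the result is a list of tuples rather than RDF/XML. Each fact is stored as a (subject, predicate, object, model) tuple, and None plays the role of an unbound variable such as X, Y, or Z.

```python
class KnowledgeBase:
    """Toy fact store; a hypothetical stand-in for Triple's knowledge base."""

    def __init__(self):
        self.facts = set()

    def add(self, subject, predicate, obj, model):
        # Corresponds to adding subject[predicate->obj]@model.
        self.facts.add((subject, predicate, obj, model))

    def query(self, subject=None, predicate=None, obj=None, model=None):
        # None acts as a variable and matches any value in that position.
        pattern = (subject, predicate, obj, model)
        return sorted(
            fact for fact in self.facts
            if all(p is None or p == f for p, f in zip(pattern, fact))
        )


kb = KnowledgeBase()
kb.add("subject", "predicate", "Hello World", "hellomodel")

# FORALL X, Y, Z <- X[Y->Z]@hellomodel.
results = kb.query(model="hellomodel")
print(results)
```

Restricting the query to a model name mirrors the @hellomodel suffix: facts stored in other models are not returned.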

The "hello world" example was a very simple demonstration of how to add facts to Triple's knowledge base and retrieve them later with a query. However, the power of Triple lies in a construct called rules.

Running Triple

There are three ways to use Triple.

The first two options involve installing Triple on your machine.

You can install Triple if you run Linux and are somewhat familiar with compiling and installing Java software with ant and C software with configure and make. Alternatively, you can use the public Triple service that is accessible at http://triple.semanticweb.org:3030/.

Triple exposes its functionality via a Java API. The preferred format for importing RDF is N3, and we strive to provide an N3 parser for rules as well. The Triple API offers the following fundamental operations.

Add
Add facts or rules to the knowledge base. Returns ok if the addition was carried out correctly, or raises an exception if 1) there was a problem accessing the files (file I/O) or 2) the addition would violate a range/domain restriction. In case 2, what should happen to reasoning? The problem is that data on the internet is very likely to be in an inconsistent state. The inconsistency should be marked, but inferencing should continue (are the domain/range restrictions used in the reasoning process anyway?).
Remove
Remove a model from the knowledge base.
Query
A query returns RDF/XML as its result.

The recommended way of using Triple is over the network. Use the Triple server that is available online. The interface to the Triple server's functionality is plain HTTP. @@@ We will provide links and show how to invoke the Triple service in Java.

Might be Triple over the network. @@@ Not really part of the language tutorial, but it has implications for distribution and results handling. Describe how Triple fits into the Web infrastructure (HTTP GET, HTTP PUT, etc.) for queries and additions/changes to the knowledge base.

It is best to load Triple files over HTTP, based on their content type.

Triple has an interface where you can use the three basic operations over the HTTP protocol. We will provide Java code examples that show how to

  1. add formulas/sentences via HTTP PUT and HTTP GET
  2. ask queries over HTTP POST and HTTP GET
  3. remove a model using HTTP GET (@@@ is there a more suitable action in HTTP for this?)

Rules Example

Triple programs consist of facts and rules. A brief example can illustrate how to use facts, rules, and queries.

An RDF statement (or fact) consisting of subject, predicate, and object is expressed in Triple as S[P->O]. @@@ why not have N3 as Triple syntax? You can define facts, such as "Andreas is a PhD student", in RDF/XML, RDF/N3, or Triple syntax. In Triple syntax, the following means that Andreas is of type PhDStudent: andreas[rdf:type->PhDStudent].

Further, you can define rules. Rules are a powerful mechanism that allows Triple to infer new facts. For example, you can state rules that express "all PhD students are underpaid": FORALL X X[is->underpaid] <- X[rdf:type->PhDStudent].

Next, you can pose queries. For example, you could ask for "all people who are underpaid": FORALL X <- X[is->underpaid]. The answer to the query above would be: X = andreas.

Triple concluded the fact "Andreas is underpaid" by reasoning over facts and rules that existed in Triple's knowledge base. Note that the new (inferred) fact was originally not stated.
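The inference step can be sketched in Python to make the mechanics concrete. The sketch uses naive forward chaining only because it is short; it makes no claim about how Triple itself evaluates rules. The names underpaid_rule and forward_chain are hypothetical, and facts are plain (subject, predicate, object) tuples.

```python
RDF_TYPE = "rdf:type"

# andreas[rdf:type->PhDStudent].
facts = {("andreas", RDF_TYPE, "PhDStudent")}


def underpaid_rule(known):
    # FORALL X X[is->underpaid] <- X[rdf:type->PhDStudent].
    return {(s, "is", "underpaid")
            for (s, p, o) in known
            if p == RDF_TYPE and o == "PhDStudent"}


def forward_chain(initial, rules):
    """Apply all rules repeatedly until no new fact is derived."""
    known = set(initial)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(known)
        if derived <= known:
            return known
        known |= derived


closure = forward_chain(facts, [underpaid_rule])

# FORALL X <- X[is->underpaid].
answers = sorted(s for (s, p, o) in closure if p == "is" and o == "underpaid")
print(answers)
```

The stated fact is still in the result; the inferred fact about andreas is added next to it, which is the behaviour described above.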

Path Expressions

For navigation purposes, path expressions have proved to be very useful in object-oriented languages. Triple allows path expressions in place of subject, predicate, or object definitions (and in all other places where terms are allowed). Path expressions are dot-delimited sequences of resources, e.g., stefan.spouse.mother denotes Stefan's mother-in-law.
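One way to read a path expression is as a chain of lookups: start at the first resource and follow each property in turn. The Python sketch below illustrates that reading under stated assumptions: follow_path is a hypothetical helper, the resources anna and maria are made-up sample data, and a path may yield several resources, so the result is a set.

```python
# Sample data (made up for illustration).
triples = {
    ("stefan", "spouse", "anna"),
    ("anna", "mother", "maria"),
}


def follow_path(triples, path):
    """Resolve a dot-delimited path such as 'stefan.spouse.mother'."""
    start, *properties = path.split(".")
    current = {start}
    for prop in properties:
        # Step from every resource reached so far along the property.
        current = {o for (s, p, o) in triples if s in current and p == prop}
    return current


print(follow_path(triples, "stefan.spouse.mother"))
```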

Namespaces

URIs are used on the Web to identify things. @@@ fix Triple's use of namespaces (i.e., allow ~ and - in model URIs, logical symbols are defined using URIs, ...).

TRIPLE has special support for namespaces and resource identifiers. Namespaces are declared via clause-like constructs of the form nsabbrev := namespace., e.g.: rdf := "http://www.w3.org/1999/02/22-rdf-syntax-ns#". Resources are written as nsabbrev:name, where nsabbrev is a namespace abbreviation and name is the local name of the resource. Resource abbreviations can be introduced analogously to namespace abbreviations, e.g. isa := rdfs:subClassOf.
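The effect of these declarations can be illustrated with a small Python sketch that expands abbreviated names into full URIs. It is an illustration only, not Triple's parser: the expand function and the two dictionaries are hypothetical, and names that match no declaration are simply returned unchanged.

```python
# Namespace abbreviations, as in: rdf := "http://...#".
namespaces = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# Resource abbreviations, as in: isa := rdfs:subClassOf.
abbreviations = {"isa": "rdfs:subClassOf"}


def expand(name):
    """Expand a resource abbreviation or nsabbrev:name into a full URI."""
    name = abbreviations.get(name, name)
    prefix, sep, local = name.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix] + local
    return name  # no matching declaration: leave the name as it is


print(expand("rdf:type"))
print(expand("isa"))
```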

Reified Statements

Reified Statements are written as < statement > and can be used inside other statements, allowing "modal" statements like stefan[believes→<Ora[isAuthorOf→homepage]>].

Anonymous Resources

@@@ extend Triple to handle them (see RDF parser/generation code).

Parameterized Models

Models can be used to group statements or rules together. @@@ throw out Horn logic and other stuff in the following paragraphs.

Instead of having a single stack of RDF, TRIPLE allows RDF statements and rules to be split up into models. Models can be passed to other models as parameters.

To assert that a set of clauses is true in a specific model, a model block is used: @model { clauses }, or, in case the model specification is parameterized: ∀ Mdl @model(Mdl) { clauses }.

RDF Models, i.e., sets of statements, are made explicit in Triple ("first class citizens"). Statements, molecules, and also Horn atoms that are true in a specific model are written as atom@model, where atom is a statement, molecule, or Horn atom and model is a model specification (i.e., a resource denoting a model), e.g.: michael[hasAge→34]@factsAboutDFKI.

Triple also allows Skolem functions as model specifications. Skolem functions can be used to transform one model (or several models) into a new one when used in rules (e.g., for ontology mapping/integration).

Figure #MOD shows how parameterized models can be used to transform/integrate multiple models (which can be ontologies, other data models, or instance data).

Figure #MOD: Information Integration With Parameterized Models

If all (or many) statements/molecules or Horn atoms in a formula are from one model, an abbreviation can be used. Model specifications can also denote the union of two models and the set-difference of two models.

Models are useful to group a set of statements together. Also, models can be used to define views on data, by only integrating the interesting part of an ontology. Then, a database administrator can give a user restricted access to only parts of an ontology that the user is allowed to view (and possibly also update).

OWL Lite- using Triple

The current version of the Triple file containing the rules for processing OWL Lite- files is available online.

@@@ change that text to describe the OWL Lite- rules. This section shows how rules axiomatizing (part of the) semantics of RDF Schema are implemented in Triple. The rules can be used together with a Horn logic based inference engine like XSB to derive additional knowledge from an RDF Schema specification.

rdf := 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'.
rdfs := 'http://www.w3.org/2000/01/rdf-schema#'.
type := rdf:type.
subPropertyOf := rdfs:subPropertyOf.
subClassOf := rdfs:subClassOf.
FORALL Mdl @rdfschema(Mdl) {
  transitive(subPropertyOf).
  transitive(subClassOf).
  FORALL O,P,V   O[P->V] <-
      O[P->V]@Mdl.
  FORALL O,P,V   O[P->V] <-
    EXISTS S   S[subPropertyOf->P] AND O[S->V].
  FORALL O,P,V   O[P->V] <-
    transitive(P) AND
    EXISTS W   (O[P->W] AND W[P->V]).
  FORALL O,T   O[type->T] <-
    EXISTS S   (S[subClassOf->T] AND O[type->S]).
}

Figure #RDFS shows the RDF Schema module in plain ASCII notation.

The first lines define namespaces (for RDF and RDF Schema) and abbreviations (for type, subPropertyOf and subClassOf).

The rules are enclosed by a model specification block: ∀ Mdl @rdfschema(Mdl) { ... }.

The Skolem function rdfschema(Mdl) is the model identifier of all facts derived by the rules enclosed by the model specification block. The parameter Mdl denotes the RDF schema specification. The model rdfschema(Mdl) contains all statements from the model Mdl plus everything derived additionally by the rules. The rule FORALL O,P,V O[P->V] <- O[P->V]@Mdl. specifies that every triple contained in the model Mdl is also an element of the model with the identifier rdfschema(Mdl). The next rule defines the inheritance of values from sub properties to super properties. The remaining rules define the semantics of transitive properties (rdfs:subPropertyOf and rdfs:subClassOf) and of the rdf:type property.

In Figure #VEC, a simple RDF schema for motor vehicles is given: the root class is xyz:MotorVehicle, which has the direct subclasses xyz:PassengerVehicle, xyz:Truck, and xyz:Van. xyz:MiniVan is defined as a common subclass of xyz:Van and xyz:PassengerVehicle (multiple inheritance).

@cars {
   xyz := "http://www.w3.org/2000/03/example/vehicles#".    
   xyz:MotorVehicle[rdfs:subClassOf -> rdfs:Resource].
   xyz:PassengerVehicle[rdfs:subClassOf -> xyz:MotorVehicle].
   xyz:Truck[rdfs:subClassOf -> xyz:MotorVehicle].   
   xyz:Van[rdfs:subClassOf -> xyz:MotorVehicle].
   xyz:MiniVan[
     rdfs:subClassOf -> xyz:Van;
     rdfs:subClassOf -> xyz:PassengerVehicle]. 
}
Figure #VEC: RDF Schema Example

The following query searches for all direct and indirect subclasses of xyz:MotorVehicle, using the RDF Schema definition for rdfs:subClassOf as defined in the rdfschema(Mdl) model: FORALL C <-C[rdfs:subClassOf->xyz:MotorVehicle]@rdfschema(cars).

This is achieved by passing the ontology (cars) as a parameter to the RDF Schema rules, whereas the query FORALL C <-C[rdfs:subClassOf->xyz:MotorVehicle]@cars. results in just the direct subclasses of xyz:MotorVehicle.
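The difference between the two queries can be reproduced with a small Python sketch. It is a simplification under stated assumptions: the function rdfschema below only copies the model and closes rdfs:subClassOf under transitivity, it ignores the subPropertyOf and type rules, and subclasses_of is a hypothetical helper standing in for the query.

```python
SUBCLASS = "rdfs:subClassOf"

# The cars model from Figure #VEC.
cars = {
    ("xyz:MotorVehicle", SUBCLASS, "rdfs:Resource"),
    ("xyz:PassengerVehicle", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:Truck", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:Van", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:MiniVan", SUBCLASS, "xyz:Van"),
    ("xyz:MiniVan", SUBCLASS, "xyz:PassengerVehicle"),
}


def rdfschema(model):
    """Copy the model and close rdfs:subClassOf under transitivity."""
    closed = set(model)
    changed = True
    while changed:
        changed = False
        for (a, p1, b) in list(closed):
            for (c, p2, d) in list(closed):
                if (p1 == p2 == SUBCLASS and b == c
                        and (a, SUBCLASS, d) not in closed):
                    closed.add((a, SUBCLASS, d))
                    changed = True
    return closed


def subclasses_of(model, cls):
    # FORALL C <- C[rdfs:subClassOf->cls]@model.
    return sorted(s for (s, p, o) in model if p == SUBCLASS and o == cls)


direct = subclasses_of(cars, "xyz:MotorVehicle")
all_subclasses = subclasses_of(rdfschema(cars), "xyz:MotorVehicle")
print(direct)
print(all_subclasses)
```

Querying the plain cars set corresponds to @cars, while querying rdfschema(cars) corresponds to @rdfschema(cars); only the second result contains xyz:MiniVan.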

rdf:type
do reasoning over type hierarchies and specify that an instance is of a certain type
rdfs:subClassOf
if we have #a rdfs:subClassOf #b, does that mean that #a rdf:type owl:Class? (or rdfs:Class)
rdfs:subPropertyOf
state that one property is a sub-property of another
owl:equivalentClass
useful for mappings: say that two classes are equivalent (I don't know why it is more expensive to say that two classes are equivalent than two individuals are the same)
owl:equivalentProperty
useful for mappings: say that two properties are equivalent

@@@ specify here and owl_lite-(Mdl) model.

WSMO/WSML document describing OWL Lite-, a subset of OWL Lite that is expressible in DLP.

How to make sense of Logic

Triple is inspired by frame logic (F-Logic), which is a syntactical extension to first-order logic. However, Triple doesn't have epistemological constructs (type, subclass, ...) built in. These constructs are described using rules, making Triple a very flexible language. "One of the most important differences to F-Logic and SiLRI is that TRIPLE does not have a fixed semantics for object-oriented features like classes and inheritance. Its layered architecture allows such features to be easily defined for different object-oriented and other data models like UML, Topic Maps, or RDF Schema." [Sintek and Decker 2002].

In the following, we give a short introduction to basic concepts of logic, as they are helpful to understand Triple. For a more detailed (and formal) introduction to logic, consult your favorite textbook about logic. @@@ what's that guys name again that has a good textbook about logic?

First, we will introduce the main constructs of the Triple language. Then, we will describe how these constructs can be simplified and compiled into a simpler form that is equivalent in terms of results. Finally, we sketch how a resolution calculus works. @@@ that definitely needs some work

Sentences/Formulas

@@@ is that right? Triple is built on sentences that consist of terms. Sentences can be facts, queries, or rules. The basic form of a descriptive sentence is "head ⇐ body", where the body is the condition and the head is the inferred new fact. Rules consist of both head and body: the fact in the head is true if the conditions in the body are true. You can see the body as a set of prerequisites, and the head as the conclusion. Terms in the body can be connected using conjunction (∧) and disjunction (∨). Facts consist only of a head (empty body); queries consist only of a body (empty head).

Molecules

Resources or literals (string, integer etc). @@@ elaborate

Connectives

Conjunction (∧), disjunction (∨), negation (¬), implication (⇐, read "if"), if-and-only-if (⇔). @@@ elaborate

Quantifiers

Existential quantifiers (∃), universal quantifiers (∀). @@@ elaborate

Variables

Variables can be used in body and head; both uppercase and lowercase variables are allowed. Variables are introduced by quantifiers (∀ and ∃).

Truth Tables

a ⇒ b ≡ ¬ a ∨ b.

A   B   ¬ A ∨ B   A ⇒ B
F   F   T         T
F   T   T         T
T   F   F         F
T   T   T         T
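The table can be checked mechanically. The following Python sketch enumerates all four assignments and compares ¬ A ∨ B with a lookup table for A ⇒ B; the names IMPLIES and rows are chosen here for illustration.

```python
from itertools import product

# Truth table of material implication A => B, written out explicitly.
IMPLIES = {
    (False, False): True,
    (False, True): True,
    (True, False): False,
    (True, True): True,
}

rows = []
for a, b in product([False, True], repeat=2):
    not_a_or_b = (not a) or b
    rows.append((a, b, not_a_or_b, IMPLIES[(a, b)]))
    print(a, b, not_a_or_b, IMPLIES[(a, b)])

# The two right-hand columns agree on every row.
equivalent = all(lhs == rhs for _, _, lhs, rhs in rows)
print(equivalent)
```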

∀ x (p ⇐ q(x)) ≡ p ⇐ ∃ x q(x).

Exercise

∀ x (p ⇐ q(x)).

∀ x (p ∨ ¬ q(x)).

p ∨ ∀ x ¬ q(x).

p ∨ ¬ ∃ x q(x).

p ⇐ ∃ x q(x).
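The first and last line of the exercise can be compared by brute force on a small domain. The Python sketch below does this for a two-element domain; it is a sanity check on one finite domain, not a proof, and the names DOMAIN, lhs, and rhs are chosen here for illustration.

```python
from itertools import product

DOMAIN = ["a", "b"]


def lhs(p, q):
    # forall x (p <= q(x)), written as forall x (p or not q(x))
    return all(p or not q[x] for x in DOMAIN)


def rhs(p, q):
    # p <= exists x q(x), written as p or not exists x q(x)
    return p or not any(q[x] for x in DOMAIN)


# Every interpretation: a truth value for p and a truth value of q per element.
interpretations = [
    (p, dict(zip(DOMAIN, values)))
    for p in (False, True)
    for values in product((False, True), repeat=len(DOMAIN))
]

agree = all(lhs(p, q) == rhs(p, q) for p, q in interpretations)
print(agree)
```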

Skolemization

Skolemization gets rid of ∃, so that you end up with only universally quantified variables.

Function symbols: ∀ x ∃ y φ(x,y) ≡ ∀ x φ(x, sk(x)).

sk is a "new" function symbol.

Domain D = {a,b}.

Interpretation 1: φ = {<a,a>, <b,b>}, which gives sk(x) = x, i.e., sk(a) = a and sk(b) = b.

Interpretation 2: φ = {<a,b>, <b,a>}, which gives sk(a) = b and sk(b) = a.
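On a finite domain, the Skolem function for a given interpretation of φ can be found by enumeration. The Python sketch below does so for the domain {a, b}; skolem_functions is a hypothetical helper, and φ is represented as a set of pairs.

```python
from itertools import product

DOMAIN = ["a", "b"]


def skolem_functions(phi):
    """Return every sk: DOMAIN -> DOMAIN with phi(x, sk(x)) true for all x."""
    found = []
    for images in product(DOMAIN, repeat=len(DOMAIN)):
        sk = dict(zip(DOMAIN, images))
        if all((x, sk[x]) in phi for x in DOMAIN):
            found.append(sk)
    return found


phi_identity = {("a", "a"), ("b", "b")}
phi_swap = {("a", "b"), ("b", "a")}

print(skolem_functions(phi_identity))
print(skolem_functions(phi_swap))
```

If φ has no partner y for some x, so that ∀ x ∃ y φ(x,y) is false, the function returns an empty list.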

Ok, so.

∀ x, y, z ∃ g φ(x, y, z, g) is translated into ∀ x, y, z φ(x, y, z, sk(x, y, z)), which ends up in a suitable format (conjunction of disjunctions). sk(x,y,z) will resolve at the end of the day to a constant, which can be the answer to some query. @@@ use universe and interpretation for that?

∀ x (p(x) ∧ q(x) ∧ z(x)). can be split into (∀ x p(x)) ∧ (∀ x q(x)) ∧ (∀ x z(x)). In that case, ∀ x p(x) is a clause.

Conjunctive Normal Form

You can bring any formula into conjunctive normal form.

Bring the formula into prenex normal form:

∀ x P(x) ∧ ∀ y Q(y) ≡ ∀ x,y (P(x) ∧ Q(y)).

Next (A and B are two possible ways of reading the formula ∀ x p(x) ⇒ ∀ y q(y)):

A: (∀ x p(x)) ⇒ (∀ y q(y)) ≡ ¬ ∀ x p(x) ∨ ∀ y q(y).

B: ∀ x (p(x) ⇒ ∀ y q(y)) ≡ ∀ x (¬ p(x) ∨ ∀ y q(y)) ≡ ∀ x, y (¬ p(x) ∨ q(y)).

Negation moves through a quantifier by flipping it: ¬ ∀ x p ≡ ∃ x ¬ p.

Applied to reading A, this gives ∃ x ∀ y (¬ p(x) ∨ q(y)).

The formulas ∀ y ∃ x p(x,y). and ∃ x ∀ y p(x,y). are different; the first says for all y there exists an x, the second says there exists an x for all y. The order of quantifiers is important.

Exercise

A ⇔ (B ⇒ C).

Apply (a ⇔ b) ≡ (a ⇒ b) ∧ (b ⇒ a).

(A ⇒ (B ⇒ C)) ∧ (A ⇐ (B ⇒ C)).

(A ⇒ (¬ B ∨ C)) ∧ (¬ (B ⇒ C) ∨ A).

(¬ A ∨ ¬ B ∨ C) ∧ [(B ∧ ¬ C) ∨ A].

(¬ A ∨ ¬ B ∨ C) ∧ (A ∨ B) ∧ (A ∨ ¬ C).

is conjunctive normal form (@@@ check)
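Because the exercise involves only three propositional variables, the result can be checked by enumerating all eight assignments. The Python sketch below compares the starting formula with the final conjunction of clauses; the function names original and cnf are chosen here for illustration.

```python
from itertools import product


def original(a, b, c):
    # A <=> (B => C), with B => C written as (not B) or C
    return a == ((not b) or c)


def cnf(a, b, c):
    # (not A or not B or C) and (A or B) and (A or not C)
    return ((not a) or (not b) or c) and (a or b) and (a or (not c))


same = all(original(*v) == cnf(*v)
           for v in product([False, True], repeat=3))
print(same)
```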

Now, how to do the transformation to disjunctive datalog?

Disjunctive Datalog

P1 ∨ ... ∨ Pn ⇐ Q1 ∧ ... ∧ Qm.

Horn Rules

Clauses with at most one positive literal are called Horn clauses.

We differentiate the following cases:

n = 1, m ≥ 1    P1 ⇐ Q1 ∧ ... ∧ Qm    Horn rule
n = 1, m = 0    P1                     fact
n = 0, m ≥ 1    ⇐ Q1 ∧ ... ∧ Qm        query
n = 0, m = 0    ⇐                      contradiction

Triple has facts (can be RDF), rules (encoded in Triple language syntax, will change to RDF), and queries (encoded in Triple syntax, will change to RDF). Triple might change to something like N3+.

Herbrand

Herbrand models are a mapping between ground terms and elements of a domain (???). @@@ not clear why this is needed

ground term    element of domain
a              "a"
b              "b"
sk(a)          "sk(a)"
sk(b)          "sk(b)"

C = {a,b} Uni.... {a', b'}

I(a) = a' I(b) = a' ???

Unification Algorithm

Worst case: exponential, O(exp), but mostly better.

t1 = f(g(X), Z). t2 = f(Y, q).

For the above, the nmg (@@@ what does nmg stand for? English texts call it the most general unifier, mgu) is nmg(t1, t2) = {Y/g(X), Z/q}.

t1 = f(g(Y), Z). t2 = f(Z, q).

There exists no nmg for the above: Z would have to be both g(Y) and q.

t1 = f(g(X), Y). t2 = f(Z, Z).

nmg(t1, t2) = {Z/g(X), Y/g(X)}.
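The three examples can be replayed with a compact unification routine. The Python sketch below is a textbook-style algorithm, not the one used inside Triple or XSB. The term encoding is an assumption made for this example: variables are strings starting with an uppercase letter, constants are lowercase strings, and f(g(X), Z) is written as the tuple ("f", ("g", "X"), "Z").

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, subst):
    # Follow variable bindings until reaching a non-variable or a free variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    # True if variable v appears inside term t (prevents X = f(X)).
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])


def unify(t1, t2, subst=None):
    """Return a unifying substitution as a dict, or None if none exists."""
    subst = {} if subst is None else subst
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if is_var(t2):
        return None if occurs(t2, t1, subst) else {**subst, t2: t1}
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        for a, b in zip(t1[1:], t2[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None  # clash between different constants or functors


first = unify(("f", ("g", "X"), "Z"), ("f", "Y", "q"))
second = unify(("f", ("g", "Y"), "Z"), ("f", "Z", "q"))
third = unify(("f", ("g", "X"), "Y"), ("f", "Z", "Z"))
print(first, second, third)
```

The second call returns None because Z is first bound to g(Y) and then has to match the constant q.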

      p(a).
      ¬ p(a).
      -----------
      ⇐

above is equivalent to

      p(a) ⇐
      ⇐ p(a).
      ------------
      ⇐

similar

      p(a).
      ∀ X ¬ p(X).
      ----------------------
      ✓ (⇐) {X/a}

more complicated:

p(X) ∨ q(a) ⇐ ...

      p(X) ⇐ q(X). (rule)
      q(a).  (fact)
      -----------------
      p(a).
      ⇐ p(X).  (query)
      -----------------
      ✓ X/a. (contradiction: result)
    

for p(a) ∨ p(X) ∧ p(a). you need only p(a) to show a ????

3. p(b) ⇐ p(X). 2. p(a). 1. ⇐ p(X).

      1. ⇐ p(Y).
      3. p(X) ⇐ p(X).
      --------------------
      4. ⇐ p(X) nmg: X/Y
    
      1. ¬ p(Y).
      3. ¬ p(X) ∨ p(X).
      ------------------------
      4. ⇐ p(X) nmg: X/Y.
      2. p(a). ⇐
      ------------------------
      5. ✓ X/a (contradiction; result)
    

Glossary

Informally describe some terms that often come up in discussions about logic these days.

Logical model

A model is an assignment to the variables under which a set of statements is true. Proving something means finding a model.

If a formula has a model, then the formula is satisfiable.

A ∨ B A = 1; B = 0; (true) (model a) B = 1; B < 1; (true) (model b) ...

Proofs work by finding the minimal model. That is, if you have the above plus C=1; D=1, then you only find the model (A=1; B=0), which is the minimal model.

domain, interpretation function, maps is a tuple (delta, Idelta.)

Minimal Model Semantics

Minimal model semantic vs. model theoretic semantic??? different ways to compute models for programs

Datalog has minimal model semantics, that means that there exists only one model for a given (???set of axioms???). That means, a ∨ b. cannot be expressed in datalog. Datalog only has conjunctive formulas???

Transitive Closure

Transitive closure over binary relations is definable/characterizable in datalog because it has minimal model semantics. Characterizing (defining) transitivity would mean having a formula that is true when relation a is the transitive closure of b.

see 2.1.7 A Limitation of First Order Logic, page 14,15 (Example 2.1.1)

Two binary relations T, G. Naive way of specifying T is the transitive closure of G:

∀ x,y T(x, y) ⇔ G(x, y) ∨ (∃ z) [ T(x,z) ∧ G(z,y) ]

Consider a structure with universe {a, b} that interprets G as {(b,b)} and T as {(a,b),(b,b)}. Test that the structure is a model of the above definition:

T(a,b) ⇔ G(a,b) ∨ (∃ z) [T(a,z) ∧ G(z,b)]. (true ⇔ true)

T(b,b) ⇔ G(b,b) ∨ (∃ z) [T(b,z) ∧ G(z,b)]. (true ⇔ true)

T(a,a) ⇔ G(a,a) ∨ (∃ z) [T(a,z) ∧ G(z,a)]. (false ⇔ false)

T(b,a) ⇔ G(b,a) ∨ (∃ z) [T(b,z) ∧ G(z,a)]. (false ⇔ false)

But: T is not the transitive closure of G, because if T were the transitive closure of G, it would only contain (b,b). Since T also has (a,b), the model T is not minimal.
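The counterexample can be checked mechanically. The Python sketch below tests the naive definition on the given structure and, separately, computes the transitive closure of G as a least fixpoint; the function names are chosen here for illustration.

```python
UNIVERSE = ["a", "b"]
G = {("b", "b")}
T = {("a", "b"), ("b", "b")}


def satisfies_definition(T, G):
    """Check T(x,y) <=> G(x,y) or exists z [T(x,z) and G(z,y)] for all x, y."""
    return all(
        ((x, y) in T) == (
            (x, y) in G
            or any((x, z) in T and (z, y) in G for z in UNIVERSE)
        )
        for x in UNIVERSE for y in UNIVERSE
    )


def transitive_closure(G):
    """Least fixpoint: start from G and add pairs until nothing changes."""
    closure = set(G)
    while True:
        new = {(x, w) for (x, y) in closure for (z, w) in G if y == z}
        if new <= closure:
            return closure
        closure |= new


print(satisfies_definition(T, G))
print(transitive_closure(G))
```

Both T and the computed closure satisfy the first-order definition, but only the closure is the minimal model, which is the point of the example.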

Full First Order Logic

@@@FOL = Full First Order Logic.

F-Logic

F-Logic = Kifer, Lausen, Wu 1995; a syntactic extension of FOL, compilable to FOL

Horn Logic

Horn logic = subset of FOL = clauses with at most one positive literal, representable as implications H <- B1 AND ... AND Bn

Datalog

Datalog = subset of FOL, database-specific (Horn logic minus function symbols)

Logic Programming

Logic programming: horn clauses

Integrity constraints vs. DL

@@@ provide the example from the mail to Jos.

What to do in case an instance doesn't obey the schema? Raise an exception (database semantics)? Or classify the stuff (DL semantics)?

Terms

A term is a constant (a), a function symbol applied to terms (f(a)), or a variable (X). A term is closed if it contains no variables.

Atoms

An atom is a predicate symbol applied to terms, e.g., p(a).

@@@ what's the difference between predicate symbols and function symbols.

Literal

Atom or negated atom.

Formula

A formula is a number of literals together with connectives. @@@ ?

Function symbols

@@@ What's the difference between function symbols and relation symbols? Relation symbols are the outer stuff; function symbols are what's inside relation symbols.

Nominals

Nominals enumerate things. For example, the nominal DayOfWeek consists of Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday, and nothing else. The "nothing else" is what makes nominals hard to deal with in reasoners.

An example of how to express nominals in OWL is given in the following:

<owl:AllDifferent>
  <owl:distinctMembers rdf:parseType="Collection">
    <Day rdf:about="#monday"/>
    <Day rdf:about="#tuesday"/>
    <Day rdf:about="#wednesday"/>
    <Day rdf:about="#thursday"/>
    <Day rdf:about="#friday"/>
    <Day rdf:about="#saturday"/>
    <Day rdf:about="#sunday"/>
  </owl:distinctMembers>
</owl:AllDifferent>

Blank nodes

say why EXISTS X X[name->"Eve"]. does not mean "there exists a resource with name "Eve".

Unique-Names Assumption

unique-names assumption? Does this mean we assume that every object has one unique name? Not really practical for use in FOAF, for example.

Equivalence

Does triple have equivalence? Could be used to model classes (A owl:unionOf B, A owl:unionOf C). A equiv B or C.

How Triple works

In the following chapter, I describe how Triple works. That is, architecture,

Architecture

As already mentioned, Triple is a modular rule language. It allows the definition of modules for semantic extensions of RDF like RDF Schema, UML, Topic Maps, OIL, and DAML+OIL, implemented either directly as Triple rules or via interaction with external reasoning components @@@ is there an API for specifying the interaction?.

TRIPLE compiles programs in the TRIPLE language to logical programming clauses that can be passed to XSB.

Figure. Idealized architecture of compilation steps

Ideally, the compilation steps should go from Triple language to LP, and for results from LP up to Triple language objects again. However, the step from LP up is not implemented yet (see Result Handling in the conclusion for more).

RDF/N3 should go through all the compilation steps as well (yes, that is a performance penalty, but it is acceptable for the sake of a clean design). That means RDF/N3 will be put into a triple.language.TripleExpression object and processed through the stack.

An RDF statement corresponds to an lp.Literal, which consists of an lp.FunctorTerm("triple", {subj, pred, obj}). subj and pred can be lp.Terms[2], with local name and namespace (@@@ read about QNames, and how the local name is to be determined). Maybe internally it is OK to use the whole URI and not make that namespace distinction?

The following describes what I was able to figure out from the Triple source code.

  1. org.semanticweb.triple.parser.LTermParser parses TRIPLE input files
  2. org.semanticweb.triple.parser.TripleTransform transforms the parse tree into org.semanticweb.triple.language
  3. convert to org.semanticweb.glp
  4. convert to org.semanticweb.lp
  5. serialize the lp clause to Prolog/pass to XSB

Parser

@@@ describe parser here. should be RDF/N3 later on, that can use external services (created by Innsbruck) that convert from the WSML myriad of syntaxes to RDF/N3.

org.semanticweb.triple.parser.Context is where the parse tree is stored. LTerm stores a model (huh? what is that?), and abbreviations (variables?).

Triple Language

@@@ Triple language is a set of objects that can be derived from parser component.

The Triple language is the syntax in which expressions (facts, rules, and queries) are written. Expressions are stored in the TripleExpression class, the root of all Triple language expressions, which has various subclasses.

General Logic Programming

@@@ What steps are needed to get from Triple to GLP?

Compiles down the TripleExpression to GLPExpression.

Logic Programming

@@@These are the objects that can be passed to XSB.

Compiled to Horn-logic clauses with function symbols.

subj[pred->obj]. becomes then true(triple(resource('_', 'subj'), resource('_', 'pred'), resource('_', 'obj')), resource('_', '_')).
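The shape of the generated clause can be reproduced with a few lines of Python. This sketch only mirrors the string quoted above; it is not the Triple compiler. The helpers resource and compile_statement are hypothetical, the reading of the final resource('_', '_') argument as the model is an assumption that the text does not confirm, and no quoting or escaping of names is attempted.

```python
def resource(name, namespace="_"):
    # Renders resource('<namespace>', '<name>'); '_' is the placeholder used above.
    return f"resource('{namespace}', '{name}')"


def compile_statement(subj, pred, obj, model="_"):
    """Render subj[pred->obj] in the clause shape quoted above.

    Assumption: the last argument of true/2 stands for the model.
    """
    triple = f"triple({resource(subj)}, {resource(pred)}, {resource(obj)})"
    return f"true({triple}, {resource(model)})."


clause = compile_statement("subj", "pred", "obj")
print(clause)
```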

XSB

Triple is written in Java, and uses XSB as backend engine. Because XSB is based on C, JNI is used to bridge the Java layer and the backend engine.

@@@ describe the brr JNI interface here

@@@ say a few words about XSB.

What is passed from parser/language to glp?

What classes do the transformation of the objects? Is storage and transformation separated? Or is it intermingled?

Conclusion

In the following, I describe future work that could be interesting or is even needed for more advanced functionality. I do not say that the functionality can be easily implemented, or implemented at all. The following points are just worthwhile to consider for subsequent versions of Triple.

Results Handling

@@@ what would be a clean way to represent queries? That is, treat it as logic formula, and don't make a distinction between tell and ask? but only have tell? What happens if I ask two queries? Maybe: for each query, return a token where the results can be retrieved? Or provide a model where the results go to? If no model is given, put results into default model? That means, if I make three queries and don't specify a model where they should go, then _all_ results (combined) are in the default model?

@@@ how is result handling currently being carried out? Triples (statements) should be the atomic objects that Triple can deal with. Providing variable bindings (X=10) is outside the scope of Triple. That way, we get a clean interface and a good separation of what Triple provides.

The answer to a query is just a list of variables and their bindings. @@@ of what type (string, integer, etc) are the returned results?

The results of queries should be available in the triple.language.* classes, that then can be easily serialized into RDF/N3 with rules extensions. That would make it easier to integrate distributed Triple instances.

Distributed Architecture

owl:imports has to be implemented somehow. (This can be done quite easily in the RDF parsing layer.)

However, to do owl:imports correctly, the engine should be able to attach function calls to predicates. That is, owl:imports triggers a piece of code that fetches the RDF from the given URL into the knowledge base (or just uses the other Triple reasoner that listens at the URL specified in owl:imports for reasoning steps, to perform distributed reasoning). The rdfs:seeAlso predicate could follow a similar approach of importing or integrating the knowledge base (be that a file or a Triple instance) behind that URI.

The basic operations then could be: get (a model), put (a model), remove (a model). Later, an optimized way of performing updates (instead of removing the old model and adding the new model) could be desirable. If models become very large, remove and add operations can get very costly. Additionally, an update method can help to optimize reasoning when an updated fact or rule affects only a small part of the knowledge base. The smallest possible change can be on a statement level (change subject, predicate, or object of a triple). A natural mapping to HTTP PUT and HTTP GET is then available (and maybe a possibility to use WebDAV for updates).

However, the reasoner should not rely on the HTTP functionality, and the core of the reasoner should be modularized and work stand-alone with a very small *.jar footprint, so that it can be easily integrated and reused in other projects.

Incremental Updates

There should be a possibility to commit small changes to a certain piece in the knowledge base. Concurrency is not yet supported here, for the duration of the update the knowledge base is simply locked. The question is how to apply diffs in a way that would facilitate reasoning performance (i.e. avoid the necessity to recompute everything once one fact has been retracted).

An update feature could make sense for committing small changes (i.e., via diff), making updates to the knowledge base more efficient.

The ability to perform incremental updates is an important feature for an online server that is

Full N3 Support

Adopt full N3 support, which gives access to all W3C test cases. If desired, converters between N3 (with the {} extensions) and other languages can be provided. Maybe N3 needs a small extension to handle the notion of models (or cast owl:imports or triple:import to simulate that feature). N3 could serve as lowest common denominator.

Predicates that do stuff/built-in predicates can be useful to implement advanced functionality while retaining the (syntactical) validity of an N3 file/document.

Extension to N3: rules have URIs as well and you can specify a model where you store things (facts, queries, and rules).

Built-ins: They are predicates (properties) which cwm recognizes and handles specially. They are "built in" to cwm. In rules terminology they are procedural attachments for sensing. They provide input to rules. (http://www.w3.org/2003/Talks/0520-www-tf1-b3-rules/slide30-0.html, http://www.w3.org/2000/10/swap/doc/Built-In.html)

Pure Java

In order to achieve most of the functionality mentioned above, access to the core inferencing engine is required. Therefore, we need a pure Java inferencing engine that allows us to make changes to the core.

Acknowledgements

Some sections of this document are based on Michael Sintek's and Stefan Decker's "Using TRIPLE for Business Agents on the Semantic Web".

Most of the logic material comes out of Stefan's brain.

Discussion with Michael Stollberg, Eyal Oren, and Wolf Winkler.

Cited Work

@@@ fix references

Gerd Wagner, Said Tabet, Harold Boley. "MOF-RuleML: The Abstract Syntax of RuleML as a MOF Model".

OWL Spec, W3C

Deborah L. McGuinness and Frank van Harmelen (eds.). "OWL Web Ontology Language Overview". W3C Recommendation 10 February 2004.

Michael Sintek, Stefan Decker. "TRIPLE - A Query, Inference, and Transformation Language for the Semantic Web". International Semantic Web Conference (ISWC), Sardinia, June 2002.

Guizhen Yang and Michael Kifer. "On the Semantics of Anonymous Identity and Reification".


Andreas Harth
$Id: tutorial.html 218 2004-08-05 16:39:56Z aharth $