JQbus - Jabber chat query services

Update: this work continues in 2009 as Buttons...

For discussion of this and SPARQL/XMPP bindings, please join the foaf-protocols list...

JQbus uses off-the-shelf Jabber chat services as a generic information bus, passing SPARQL queries and results via user accounts, encoded as XMPP IQ messages.

Jabber bot dialog showing sparql

JQbus provides a Jabber (XMPP) transport for SPARQL queries and responses, using Java. Each question comes "from" some Jabber account (possibly shared with a human user) and is routed by Jabber magic to code attached to another such account, whose response is transmitted back, addressed only to a jabber: ID.

What does it do?

Given a pair of Jabber accounts capable of exchanging messages (and in particular, custom IQ queries), we provide basic glue code that handles the passing of SPARQL queries and their corresponding XML-encoded responses.

The Jabber layers, in turn, take care of ugly details such as authentication, buddylists, getting messages through NAT/firewalls.

Documentation

See also the README.txt for some getting-started specifics, CHANGES.txt for progress reports, and the javadoc for API detail. This is experimental code; assume nothing works, and maybe you'll be pleasantly surprised. This code began with a conversation with Peter Saint-Andre about binding SPARQL to Jabber, and an implementation that I started and that Leigh Dodds kindly helped clean up. Chris Schmidt has a nice implementation of Peter's original design using Python/Redland. This Java implementation uses a slightly different binding.

Thanks to Peter Saint-Andre and Dirk-Willem van Gulik for help with the original Jabber/XMPP design, Chris for getting a Python implementation out the door before I even finished this one, and especially to Leigh for refactoring the original code into something more sane. If we can sync several implementations of the basic concept, I'll try to update http://www.w3.org/2005/09/xmpp-sparql-binding and investigate possibilities for a W3C Note.

JQbus uses the HP Labs Jena library for RDF support (including the ARQ SPARQL engine), and the Ignite Realtime (Jive software) Smack library for XMPP/Jabber support. SPARQL queries are defined by the W3C sparql spec alongside an XML response format - this software depends upon both. W3C also has specs for a SPARQL protocol, and a JSON response format. This work does not currently pay close enough attention to the SPARQL Protocol specification, and offers no support for JSON or other result format bindings. These are natural areas for further work. There is also a WSDL representation of sparql-protocol; however the expectation behind this work is that Jabber offers enough specialist facilities (discovery, rosters, etc.) for a custom rather than machine generated interface to be worthwhile, even though there may be automatic WSDL-to-XMPP mappings. Again, an area for investigation.

Familiar Looking Cloud Diagram

JQbus diagram

We've all been drawing such diagrams for years. The justification for this one is as a way to explain how "Semantic Desktop" efforts might plug into a story for broader access to personal data on the Semantic Web. Loosely: efforts like Nepomuk-KDE and Gnowsis are providing an RDF view of much desktop data, through which descriptions of photos, audio/video, calendars, addressbooks etc might be made available - selectively - to friends, colleagues and software working on behalf of those people. JQbus is an experimental toy for exposing desktop data in this fashion, using off-the-shelf Jabber accounts. The other services sketched in the diagram are part of a bigger unwritten story for how this infrastructure facilitates decentralised social networks. Short version: social networks are about people, not commercial websites. They can't be bought nor sold, nor fenced in :) But anyway...

JQbus and other such systems put us in a design space where we can have an incoming message, perhaps arriving at a personal desktop. We know that the message carries a SPARQL query, and we know the Jabber ID of the party sending it. And we have an easy way to send a response. JQbus provides no direct access control machinery, although we could check whether the requester is on our buddylist roster.
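The roster check mentioned above might look something like the following. This is only a sketch and not part of JQbus: the class name RosterGate and its methods are made up for illustration, the roster is assumed to be available as a plain set of lower-cased bare JIDs, and lower-casing stands in for proper XMPP address normalisation.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not JQbus API): is the sender of a query on our roster? */
final class RosterGate {
    // Assumption: bare JIDs ("user@host"), already lower-cased by the caller.
    private final Set<String> rosterBareJids;

    RosterGate(Set<String> rosterBareJids) {
        this.rosterBareJids = rosterBareJids;
    }

    /** Strips the "/resource" part, so "a@b.org/client" becomes "a@b.org". */
    static String bareJid(String fullJid) {
        int slash = fullJid.indexOf('/');
        String bare = slash < 0 ? fullJid : fullJid.substring(0, slash);
        // Simplification: real XMPP address comparison uses stringprep profiles.
        return bare.toLowerCase(Locale.ROOT);
    }

    /** True only when the sender's bare JID is in the roster set. */
    boolean mayQuery(String senderJid) {
        return senderJid != null && rosterBareJids.contains(bareJid(senderJid));
    }
}
```

A server-side handler could call mayQuery() with the "from" address of an incoming IQ before running the query, and answer with an error otherwise.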

Protocol Overview

From RFC-3920:

Info/Query, or IQ, is a request-response mechanism, similar in some ways to HTTP. The semantics of IQ enable an entity to make a request of, and receive a response from, another entity. The data content of the request and response is defined by the namespace declaration of a direct child element of the IQ element, and the interaction is tracked by the requesting entity through use of the 'id' attribute.
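The id-based tracking that the RFC describes can be sketched in plain Java. This is a minimal illustration rather than JQbus code: the class name PendingIqs, the "jq-" id prefix and the use of a String for the response payload are all assumptions, and a real client library would normally do this bookkeeping for you.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bookkeeping: pairs each outgoing IQ id with a pending reply. */
final class PendingIqs {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final AtomicLong counter = new AtomicLong();

    /** Registers a request and returns a fresh id; the caller sends the stanza. */
    String register(CompletableFuture<String> reply) {
        String id = "jq-" + counter.incrementAndGet(); // id format is an assumption
        pending.put(id, reply);
        return id;
    }

    /** Completes the request with this id; returns false if the id is unknown. */
    boolean onResult(String id, String payload) {
        CompletableFuture<String> reply = pending.remove(id);
        if (reply == null) {
            return false; // late, duplicate, or unsolicited response
        }
        reply.complete(payload);
        return true;
    }
}
```

The requesting side keeps the future returned to its caller, and whatever code reads the incoming stream hands each type="result" stanza to onResult() using the stanza's id attribute.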

How does it look?

A Jabber (ie. XMPP) client maintains an XML-based conversation with a service connection. It is something like a never-ending streamed XML document. Here is an "IQ" stanza within such a conversation, from the point of view of the sender:

<iq id="S3IG2-4" to="danbri@livejournal.com/sparqlserver" type="get">
<query xmlns='http://www.w3.org/2005/09/xmpp-sparql-binding'>
PREFIX foaf: &lt;http://xmlns.com/foaf/0.1/&gt; SELECT DISTINCT ?o WHERE {?s foaf:name ?o.}
</query>
</iq>

Here is how that looks to the receiving party, in their conversation with their Jabber service:

<iq id="40z5D-4" to="danbri@livejournal.com/sparqlserver" from="bandri@livejournal.com/sparqlclient" type="get">
<query xmlns="http://www.w3.org/2005/09/xmpp-sparql-binding">
PREFIX foaf: &lt;http://xmlns.com/foaf/0.1/&gt; SELECT DISTINCT ?o WHERE {?s foaf:name ?o.}
</query>
</iq>
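For illustration, here is roughly how a request stanza like the ones above could be assembled by hand. It is a sketch only: the class SparqlIq and its methods are invented for this page, and JQbus itself relies on the Smack library to write stanzas. The point it shows is that the SPARQL text has to be XML-escaped, which is why the angle brackets around the FOAF namespace appear as &lt; and &gt; on the wire.

```java
/** Hypothetical helper (not JQbus API): builds a SPARQL-over-IQ request by hand. */
final class SparqlIq {
    static final String BINDING_NS = "http://www.w3.org/2005/09/xmpp-sparql-binding";

    /** Escapes the five characters that are special in XML text and attributes. */
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Returns an iq stanza of type "get" carrying the escaped query text. */
    static String buildRequest(String id, String toJid, String sparql) {
        return "<iq id=\"" + escapeXml(id) + "\" to=\"" + escapeXml(toJid) + "\" type=\"get\">\n"
             + "<query xmlns='" + BINDING_NS + "'>\n"
             + escapeXml(sparql) + "\n"
             + "</query>\n"
             + "</iq>";
    }
}
```

Calling buildRequest("S3IG2-4", "danbri@livejournal.com/sparqlserver", query) with the unescaped FOAF query gives a stanza of the same shape as the sender-side example.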

Note: I'm testing with two LiveJournal accounts, here "bandri" is asking questions of "danbri"; it should be possible to use jabber.org, gmail/gtalk and other providers, so long as the Jabber servers are federated fully. You can see that the data is pretty much unchanged, except that a different stream-specific id is used in each. The id serves to tie together a conversation across various XML elements, locally between a Jabber client and its service provider. Looking to the response format, again from the querying party's perspective, we see:

<iq id="40z5D-4" to="bandri@livejournal.com/sparqlclient" 
  from="danbri@livejournal.com/sparqlserver" type="result">
<query-result xmlns="http://www.w3.org/2005/09/xmpp-sparql-binding">
<meta comments="content generated via DOM conversion."/>
<sparql xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
	xmlns:xs="http://www.w3.org/2001/XMLSchema#" 
	xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="o"/>
  </head>
  <results>
    <result>
      <binding name="o">
        <literal>Libby Miller</literal>
      </binding>
    </result>
    <result>
      <binding name="o">
        <literal>Tim Berners-Lee</literal>
      </binding>
    </result>
<!-- ... more bindings here -->
  </results>
</sparql>
</query-result>
</iq>

Note: the markup here is as specified for the XML results format, but embedded within a broader protocol context. The initial design used <query> elements for both the question and response. The current design uses different XML element names; this was largely motivated by implementation pragmatics w.r.t. the Smack library and the way it attaches custom handlers to IQ messages. But it also makes some sense intuitively; a response is not a query.
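On the client side, the embedded results can be pulled out with any namespace-aware XML parser. The sketch below uses only the JDK's DOM API. The class name ResultReader is hypothetical and the code is not taken from JQbus (which uses Jena and Smack); it also assumes the whole response stanza is available as one string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper (not JQbus API): reads literal bindings from a response. */
final class ResultReader {
    static final String RESULTS_NS = "http://www.w3.org/2005/sparql-results#";

    /** Returns the literal values bound to the given variable, in document order. */
    static List<String> literals(String xml, String variable) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // The stanza comes from another party, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<String> values = new ArrayList<>();
            NodeList bindings = doc.getElementsByTagNameNS(RESULTS_NS, "binding");
            for (int i = 0; i < bindings.getLength(); i++) {
                Element binding = (Element) bindings.item(i);
                if (!variable.equals(binding.getAttribute("name"))) {
                    continue; // binding for some other variable
                }
                NodeList lits = binding.getElementsByTagNameNS(RESULTS_NS, "literal");
                for (int j = 0; j < lits.getLength(); j++) {
                    values.add(lits.item(j).getTextContent());
                }
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse SPARQL results XML", e);
        }
    }
}
```

Because the lookup is by namespace and local name, it works the same whether the sparql element sits inside query-result or inside some other wrapper, provided the results namespace is the current one.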

The Python code at http://crschmidt.net/semweb/sparqlxmpp/ uses a slightly different binding:


	<iq to='crschmidt@crschmidt.net/sparql' type='result' id='2' from='crschmidt@crschmidt.net/sparql'>
	<query xmlns='http://www.w3.org/2005/09/xmpp-sparql-binding'>
	<meta />
	<sparql xmlns='http://www.w3.org/2001/sw/DataAccess/rf1/result'>
	  <head> ...

Chris' code uses 'query' subelements of 'iq' for both request and response (the underlying jabber.py encourages this), and the SPARQL results namespace needs updating.

This diagram gives an overview of the Jabber conversations undertaken by an installation playing the "server" role; since it is a real Jabber account, other things can also be seen. This gives an idea of the data environment in which this code finds itself (eg. presence info from friends on the roster aka buddylist).

JQbus server log

Code

There is Java src in Subversion. The downloads area should have the latest code bundle.

Even stalkers and historians have better things to do than read the pre-Subversion cvs logs. Forgive me for not migrating them.

Quick Start

Read the README.txt, although it may lag behind the code. Or if you're feeling lucky, try running these scripts:

wget -nd http://svn.foaf-project.org/foaftown/downloads/jqbus-latest.tar.gz
tar -zxvf jqbus-latest.tar.gz 
cd jqbus/

# please substitute your own Jabber accounts here! :)

# to run in server mode:
ant -Dfoaftown.role=server -Dfoaftown.pwd=yyyy -Dfoaftown.my_jid=danbri@livejournal.com

# to run in client mode (we have to name a server JID too):
ant -Dfoaftown.role=client -Dfoaftown.pwd=xxxx -Dfoaftown.my_jid=danbrickley@gmail.com \
	 -Dfoaftown.other_jid=danbri@livejournal.com


# Gasp in amazement as stuff scrolls up the screen and debug windows show you the
# XMPP conversations from the point of view of server and/or client.
# ...

Limitations

At the moment, JQbus provides only minimalist glue code. The Jabber-level autodiscovery is not yet implemented. There is no access control. Account details are hardcoded. There is no roster inspection. RDF datasources are hardcoded, and loaded into a flat triple-space instead of into named graph contexts. We're still at the proof of concept stage, basically. But it's a start!

Future work areas...

Contact

Dan Brickley