Saturday, January 4, 2014

Introducing "Babbler" - A new XMPP client library for Java

Abstract

I've had some spare time recently due to my winter holidays and felt like coding something in Java. It turned out I wanted to try some XMPP (standard instant messaging protocol) related stuff… something I've had in mind for quite some time, because apparently nobody has done it yet: an XMPP client library for Java based on JAXB. I call it "Babbler" due to its semantic resemblance to "Jabber" (XMPP's former name).

In the following I want to briefly tell you why I did it, what I did and how I did it.

Motivation

I've been a long-time Smack user. However, there are a few (fundamental) things I am not comfortable with. Unfortunately, most of these things are embedded so deeply into Smack's core that they can't simply be fixed by contributing a patch. In particular, the main drawbacks that have kept annoying me are:

  • Smack depends on org.xmlpull.v1.XmlPullParser etc., a library which feels like a leftover from the early 2000s, from a time when there was neither StAX nor JAXB.
    Personally I found it to be very inconvenient to write (your own) XMPP extensions or parse XML from your private XML storage. In contrast, JAXB would be so much easier to work with.
  • XML is generated by String concatenation, which is in my opinion an old-fashioned and error-prone approach. JAXB would take care of this as well.
  • There's no class representing a JID. Every JID is a String, which often led to confusion in our development team, when working with users in general. Sometimes a String is a JID and sometimes it is only the local part (not in Smack, but in our application) and you can't really know, unless you read the documentation.
  • Smack's BOSH implementation is a mess. Let me explain. Apart from some minor bugs (e.g. no proxy support, connection exceptions do not trigger the ConnectionListener), it works! But... the messy part is rather the large number of dependencies it requires:
    • jbosh-0.6.0.jar (66 KB)
    • xlightweb-2.5.jar (411 KB)
    • xSocket-2.4.6.jar (312 KB)
    • xpp3-1.1.3.3.jar (91 KB)
    • dom4j-1.6.1.jar (306 KB)
    And of course the Smack classes for the BOSH extension.
    Altogether it takes up over 1.2 MB, about four times as much as the core smack.jar!
    After spending too much time debugging through all these classes while trying to implement proxy support for BOSH, I was convinced that this can and should be simpler. All these dependencies are way too much, considering that all you actually need for BOSH to work are 2-3 classes: one class that represents the <body/> element and another class which implements the BOSH technique (by simply using java.net.HttpURLConnection); and maybe one more to represent an exception.
  • Extensions could be managed in a better way. (Static initializers, no common design, no common package: some are in the core package, some directly in the smackx package and others in their own package).

Don't get me wrong, Smack is probably the best Java XMPP library out there (haven't tried any other)! It has good documentation, a community site, supports most important XMPP extensions and there's even an Android port. The parsing limitations are obviously only due to historical reasons.

But why not try to leverage JAXB for XMPP processing? Let's see...

Implementing XMPP with JAXB

The biggest challenge with JAXB in conjunction with XMPP is that JAXB writes namespace prefixes for each element and also adds each known namespace to the root element, even if it's not needed for a particular stanza.

This results in XML like this:

<iq xmlns:ns11="http://etherx.jabber.org/streams" xmlns:ns7="urn:ietf:params:xml:ns:xmpp-stanzas"
    xmlns:ns6="urn:ietf:params:xml:ns:xmpp-bind" xmlns:ns5="jabber:iq:roster"
    xmlns:ns4="urn:ietf:params:xml:ns:xmpp-sasl" xmlns="jabber:client" id="id" type="get">
   <ns5:query>
      <ns5:item jid="node1@domain" name="Name"><ns5:group>Group1</ns5:group></ns5:item>
   </ns5:query>
</iq>

What XMPP prefers, however, is a so-called prefix-free canonicalization as described in the specification, so that we instead have XML like this:

<iq xmlns="jabber:client" id="id" type="get">
   <query xmlns="jabber:iq:roster">
      <item jid="node1@domain" name="Name"><group>Group1</group></item>
   </query>
</iq>

Fortunately, this can be achieved by passing a custom XMLStreamWriter implementation (which simply does not write prefixes) to the JAXB Marshaller. It took some time to figure that out, though ;-). Take a look if you are interested in the details.
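To give a rough idea of such a writer, here is a minimal sketch that uses only StAX, so it runs without JAXB. It is not Babbler's actual implementation: the class name PrefixFreeWriterSketch and its demo() method are made up for illustration, and the sketch assumes that the caller announces namespaces the way a prefix-writing marshaller would. It wraps any XMLStreamWriter in a dynamic proxy, swallows prefix declarations and writes xmlns="..." only where the namespace changes.

```java
import java.io.StringWriter;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical helper for illustration; not Babbler's actual class. */
final class PrefixFreeWriterSketch {

    private PrefixFreeWriterSketch() {
    }

    /** Wraps a writer so that namespaces are only declared as xmlns="...". */
    static XMLStreamWriter wrap(final XMLStreamWriter delegate) {
        // Namespace URIs of the currently open elements, innermost first.
        final Deque<String> open = new ArrayDeque<>();
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                String name = method.getName();
                int argc = args == null ? 0 : args.length;
                boolean start = name.equals("writeStartElement");
                if (start || name.equals("writeEmptyElement")) {
                    // Overloads: (local), (ns, local), (prefix, local, ns).
                    String parent = open.isEmpty() ? "" : open.peek();
                    String local = (String) args[argc == 1 ? 0 : 1];
                    String ns = argc == 1 ? parent : (String) args[argc == 2 ? 0 : 2];
                    if (ns == null) {
                        ns = "";
                    }
                    if (start) {
                        delegate.writeStartElement(local);
                        open.push(ns);
                    } else {
                        delegate.writeEmptyElement(local);
                    }
                    if (!ns.equals(parent)) {
                        // Declare the namespace only where it differs from the parent's.
                        delegate.writeDefaultNamespace(ns);
                    }
                    return null;
                }
                if (name.equals("writeNamespace") || name.equals("writeDefaultNamespace")
                        || name.equals("setPrefix") || name.equals("setDefaultNamespace")) {
                    return null; // Swallow the declarations requested by the caller.
                }
                if (name.equals("writeAttribute") && argc > 2
                        && !XMLConstants.XML_NS_URI.equals(args[argc - 3])) {
                    // Drop the attribute prefix; xml:* attributes are passed through.
                    delegate.writeAttribute((String) args[argc - 2], (String) args[argc - 1]);
                    return null;
                }
                if (name.equals("writeEndElement") && !open.isEmpty()) {
                    open.pop();
                }
                try {
                    return method.invoke(delegate, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause();
                }
            }
        };
        return (XMLStreamWriter) Proxy.newProxyInstance(
                PrefixFreeWriterSketch.class.getClassLoader(),
                new Class<?>[] {XMLStreamWriter.class}, handler);
    }

    /** Writes a roster query with prefixes requested and returns the resulting XML. */
    static String demo() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = wrap(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
            writer.writeStartElement("", "iq", "jabber:client");
            writer.writeNamespace("ns5", "jabber:iq:roster");
            writer.writeAttribute("type", "get");
            writer.writeStartElement("ns5", "query", "jabber:iq:roster");
            writer.writeEndElement();
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

With JAXB on the classpath, a writer wrapped like this could be handed to Marshaller.marshal(Object, XMLStreamWriter). Whether the sketch covers every call a particular JAXB implementation makes is an assumption that would need testing.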

On the other hand, parsing XML is pretty straightforward by using XMLEventReader and passing it to the Unmarshaller.

The reader waits for an incoming XML stanza and is then passed to the Unmarshaller, which will then completely parse the XML stanza into a Java object, e.g. a message. Afterwards it waits for the next stanza and so on.

(As a side note, the XMLStreamReader could not be used here, because the Unmarshaller blocks the reading thread until new XML comes over the stream. An obstructive nuisance, which has also been discovered by someone else.)
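The read loop described above can be sketched with StAX alone. This is only an illustration under stated assumptions: the class and method names are hypothetical, and where Babbler hands the reader to the JAXB Unmarshaller, the sketch merely consumes the events of one stanza and records its element name, so that it runs without JAXB.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

/** Hypothetical reader loop for illustration; not Babbler's actual class. */
final class StanzaReaderSketch {

    private StanzaReaderSketch() {
    }

    /** Returns the element names of all stanzas found inside the stream root. */
    static List<String> readStanzas(Reader in) {
        List<String> names = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
            boolean insideStream = false;
            while (reader.hasNext()) {
                // peek() lets us look at the next event without consuming it.
                XMLEvent next = reader.peek();
                if (!next.isStartElement()) {
                    reader.nextEvent(); // skip whitespace, end tags, document events
                } else if (!insideStream) {
                    reader.nextEvent(); // the stream header itself is not a stanza
                    insideStream = true;
                } else {
                    // This is the point where a JAXB-based reader would call
                    // unmarshaller.unmarshal(reader) to get a Java object.
                    names.add(consumeElement(reader));
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    /** Consumes one complete element, including its children. */
    private static String consumeElement(XMLEventReader reader) throws XMLStreamException {
        StartElement start = reader.nextEvent().asStartElement();
        int depth = 1;
        while (depth > 0) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement()) {
                depth--;
            }
        }
        return start.getName().getLocalPart();
    }
}
```

On a live connection the stream never ends, so the loop would simply block until the server sends the next stanza.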

For both reading and writing XML, all you have to take care of is setting up the JAXBContext correctly, i.e. making sure that each XMPP element (including extensions) is added to the context, so that a Java object can be marshalled to XML and XML can be unmarshalled (parsed) back to an object.

So what's the benefit after all?

It's easy: Writing XMPP extensions can now become as simple as that:

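import java.util.Date;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlValue;
import javax.xml.bind.annotation.adapters.XmlJavaTypeAdapter;

// Jid and JidAdapter are the library's own classes.
@XmlRootElement(name = "delay", namespace = "urn:xmpp:delay")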
@XmlAccessorType(XmlAccessType.FIELD)
public final class DelayedDelivery {

    @XmlJavaTypeAdapter(JidAdapter.class)
    @XmlAttribute(name = "from")
    private Jid from;

    @XmlAttribute(name = "stamp")
    private Date timestamp;

    @XmlValue
    private String reason;

    private DelayedDelivery() {
    }

    public Jid getFrom() {
        return from;
    }

    public Date getTimeStamp() {
        return timestamp;
    }

    public String getReason() {
        return reason;
    }
}

(This is the implementation of XEP-0203: Delayed Delivery.)

No need to write XML. No need to parse XML. No need to escape XML content etc. JAXB does it all for you!

In my opinion this is as easy and stable as it could get.

Getting this extension from a message can then be done with:

DelayedDelivery delayedDelivery = message.getExtension(DelayedDelivery.class);

Designing the API

Besides using JAXB, I had several general design goals in mind while implementing the XMPP library:

  • Use an adequate core package: org.xmpp and use a separate sub-package for each XMPP namespace (e.g. bind, sasl, stanza, tls) and each extension.
  • Use java.util.logging to properly log exceptions and XMPP input/output (I didn't want to use a 3rd party logger only for that).
  • Use proper and existing exceptions, e.g. javax.security.auth.login.LoginException for login failures or java.util.concurrent.TimeoutException, if a timeout occurred while talking with an XMPP server.
  • Make it extensible and customizable by allowing new extensions and so-called extension managers (which are implementations of more complex XEPs like chat state notifications) to be registered or unregistered.
  • Use an event-driven design by using the Java Event Model.
  • Implement both core XMPP specifications (RFC 6120, RFC 6121).
  • Keep it lean and simple, e.g. by not relying on dependencies and by making use of existing Java classes (e.g. for base 64 encoding or SASL negotiation).

The library's core revolves around a connection class, which lets you connect and authenticate to an XMPP server and afterwards provides methods to listen for stanzas or to send them.

Sending and receiving stanzas

Sending stanzas is simply done with:

connection.send(message);

Receiving stanzas, in contrast, uses an event-driven approach:

connection.addPresenceListener(new PresenceListener() {
    @Override
    public void handle(PresenceEvent e) {
        if (e.isIncoming()) {
            Presence presence = e.getPresence();
        }
    }
});
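A listener API like this follows the Java Event Model. As a rough sketch of what could sit behind it (all type names here are hypothetical and a plain String stands in for the stanza; this is not Babbler's actual code), an event class extends java.util.EventObject, a listener interface extends java.util.EventListener, and the event source keeps a list of listeners:

```java
import java.util.EventListener;
import java.util.EventObject;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical event; the source is the object that fired it. */
final class StanzaEventSketch extends EventObject {

    private final String stanza;
    private final boolean incoming;

    StanzaEventSketch(Object source, String stanza, boolean incoming) {
        super(source);
        this.stanza = stanza;
        this.incoming = incoming;
    }

    String getStanza() {
        return stanza;
    }

    boolean isIncoming() {
        return incoming;
    }
}

/** Hypothetical listener interface with a single callback. */
interface StanzaListenerSketch extends EventListener {
    void handle(StanzaEventSketch e);
}

/** Hypothetical event source, standing in for the connection. */
final class EventSourceSketch {

    // Copy-on-write, so listeners can be added or removed while an event is delivered.
    private final List<StanzaListenerSketch> listeners = new CopyOnWriteArrayList<>();

    void addStanzaListener(StanzaListenerSketch listener) {
        listeners.add(listener);
    }

    void removeStanzaListener(StanzaListenerSketch listener) {
        listeners.remove(listener);
    }

    /** Notifies every registered listener about a sent or received stanza. */
    void fire(String stanza, boolean incoming) {
        StanzaEventSketch event = new StanzaEventSketch(this, stanza, incoming);
        for (StanzaListenerSketch listener : listeners) {
            listener.handle(event);
        }
    }
}
```

Firing the same event type for incoming and outgoing stanzas is why the handler above checks e.isIncoming() first.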

There's also a (blocking) method for querying another XMPP entity:

IQ result = connection.query(iq);

Managing core XMPP aspects

For managing core aspects, like TLS, SASL, the roster or chat sessions, there's a corresponding manager on the connection object, e.g.:

connection.getSecurityManager().setSSLContext(customSSLContext);
connection.getAuthenticationManager().setPreferredMechanisms(preferredMechanisms);
connection.getPresenceManager().requestSubscription(Jid.valueOf("juliet@example.com"), "Please add me!");
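The Jid.valueOf call above relies on a dedicated JID type instead of a plain String. A minimal sketch of such a value class could look like this; the name JidSketch is made up, it is not Babbler's actual Jid, and it skips the validation and normalization a real implementation needs:

```java
/** Hypothetical value class for illustration; not Babbler's actual Jid. */
final class JidSketch {

    private final String local;    // null for a pure domain JID
    private final String domain;
    private final String resource; // null for a bare JID

    private JidSketch(String local, String domain, String resource) {
        this.local = local;
        this.domain = domain;
        this.resource = resource;
    }

    /** Parses "[local@]domain[/resource]" without any further validation. */
    static JidSketch valueOf(String jid) {
        // The resource starts after the first slash and may itself contain '@' or '/'.
        int slash = jid.indexOf('/');
        String bare = slash < 0 ? jid : jid.substring(0, slash);
        String resource = slash < 0 ? null : jid.substring(slash + 1);
        int at = bare.indexOf('@');
        String local = at < 0 ? null : bare.substring(0, at);
        String domain = at < 0 ? bare : bare.substring(at + 1);
        if (domain.isEmpty()) {
            throw new IllegalArgumentException("Missing domain: " + jid);
        }
        return new JidSketch(local, domain, resource);
    }

    String getLocal() {
        return local;
    }

    String getDomain() {
        return domain;
    }

    String getResource() {
        return resource;
    }

    @Override
    public String toString() {
        return (local != null ? local + "@" : "") + domain + (resource != null ? "/" + resource : "");
    }
}
```

With a type like this, a method signature makes clear whether it expects a full JID or only a local part.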

Managing extensions

A non-trivial challenge is how extensions like chat state notifications or message delivery receipts should be managed. Most protocol extensions like these require some kind of logic, which is implemented by - I call them - extension managers.

Unlike Smack, I didn't want these managers to keep a reference to the connection, but vice versa: the connection object keeps references to all of its known extension managers. This way, they can simply be created when the connection is created and garbage collected when the connection is gc'ed, thus avoiding static initializers or weak references.

To illustrate the approach, I've thought about using extension managers like this:

LastActivityManager lastActivityManager = connection.getExtensionManager(LastActivityManager.class);
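One way a connection could back such a lookup is a map keyed by the manager's class. The following is only a sketch with hypothetical names (ManagedConnectionSketch, LastActivityManagerSketch), assuming that managers are registered when the connection is created:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical manager, standing in for a real extension manager. */
final class LastActivityManagerSketch {
}

/** Hypothetical connection that owns its extension managers. */
final class ManagedConnectionSketch {

    // The connection references the managers, so they are collected together with it.
    private final Map<Class<?>, Object> managers = new HashMap<>();

    ManagedConnectionSketch() {
        // Known managers are created together with the connection.
        register(LastActivityManagerSketch.class, new LastActivityManagerSketch());
    }

    /** Registers a manager under its class; the generic type keeps key and value in sync. */
    <T> void register(Class<T> type, T manager) {
        managers.put(type, manager);
    }

    /** Removes a previously registered manager. */
    void unregister(Class<?> type) {
        managers.remove(type);
    }

    /** Returns the manager registered for the given class, or null if there is none. */
    <T> T getExtensionManager(Class<T> type) {
        return type.cast(managers.get(type));
    }
}
```

Because the class token carries the type, callers get the manager back without a cast.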

Still not sure about that approach though… comments appreciated ;-)

What about the BOSH implementation?

As mentioned, I believe a BOSH implementation should not be that bloated, because it's actually pretty easy:

  1. You need a class which implements the <body/> wrapper element. Pretty straightforward, nothing fancy here.
  2. You need another class, which implements the BOSH technique: You need two threads in order to make two simultaneous HTTP requests. When doing an HTTP request you just put the XMPP elements (e.g. messages, presences, …) into the <body/> wrapper element, increment the request ID and send it. Back comes a response - again as <body/> wrapper element - whose content can be processed just as any other XMPP element. For the HTTP requests the java.net.HttpURLConnection class can be used.

There are some minor other things to consider, like tracking acknowledgments or terminating a BOSH session, but basically you only need to write two classes.
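The two classes described above could look roughly like the following sketch, folded into one class for brevity. Everything here is an assumption for illustration: the class name, the constructor parameters and the choice to render only an empty <body/>. Only the http://jabber.org/protocol/httpbind namespace and the rid and sid attributes come from the BOSH specification (XEP-0124). The two request threads and acknowledgment tracking are left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.StringWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical BOSH helper for illustration; not Babbler's actual class. */
final class BoshSketch {

    static final String NAMESPACE = "http://jabber.org/protocol/httpbind";

    private final String sessionId;
    private final AtomicLong requestId;

    BoshSketch(String sessionId, long initialRequestId) {
        this.sessionId = sessionId;
        this.requestId = new AtomicLong(initialRequestId);
    }

    /** Renders an empty body wrapper and increments the request ID for the next call. */
    String nextEmptyBody() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeEmptyElement("body");
            writer.writeDefaultNamespace(NAMESPACE);
            writer.writeAttribute("rid", Long.toString(requestId.getAndIncrement()));
            writer.writeAttribute("sid", sessionId);
            // A real request would write the marshalled stanzas as children here.
            writer.writeEndDocument();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    /** Sends one body via HTTP POST and returns the response body as a String. */
    static String post(URL url, String body) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true);
        connection.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        try (OutputStream os = connection.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream is = connection.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = is.read(chunk)) != -1; ) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

The response returned by post(...) would again be a <body/> element, whose children can be parsed like any other XMPP element.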

So what now? Is there some code?

Having said all this, it's time to present some code!

I've published my code on Bitbucket in a public Mercurial repository as a Maven project under the MIT License.

Note that this is standard Java 7 code and also relies on APIs from Java SE, so you won't be able to use the library on Android!

Also keep in mind that it's just an experimental project for the JAXB and BOSH approach and is therefore not (yet) intended for production use! For example, at the time of this writing, most of the existing extensions are merely almost empty packages without much logic, and the API will likely change.

But I'd say I succeeded in implementing 95% of the core specifications, short of some things like the SCRAM-SHA-1 SASL mechanism or server certificate checking.

I'd be happy if someone tried it out and left some feedback, despite the lack of good documentation (there are currently only JavaDocs in the download section).

Here's a simple code snippet to get you started:

Connection connection = new TcpConnection("hostname", 5222);
connection.addMessageListener(new MessageListener() {
    @Override
    public void handle(MessageEvent messageEvent) {
        System.out.println("Message received.");
    }
});
try {
    connection.connect();
    connection.login("username", "password");
    connection.send(new Presence());
} catch (IOException | TimeoutException | LoginException e) {
    e.printStackTrace();
}

There's also a very simple JavaFX application in the test folder. If you have a Facebook account, you could use "chat.facebook.com" as server, 5222 as port and your FB account to log in.

That's it for now! Have fun and leave a comment!