Tuesday, April 26, 2011

almost done

JBossTS 4.15.0.Final is done. It's the fastest, most robust release yet. It's also the last feature release of its line.

As well as working standalone, 4.15 is the feature release for integration with JBossAS 7 and thence JBossEAP 6. As AS 7 is still a moving target we've got a bit of tweaking to do for the integration over the next few weeks, particularly on JTS and XTS. After that it's maintenance only on the 4.15.x branch and the community focus moves on to the next major version.

As well as new features, the next major release cycle will have a new project name (complete with snazzy new logo, naturally) and a new project lead.

That change should hopefully leave me with some time to do new things too, particularly looking at the implications of the cloud computing model and NoSQL storage solutions for transaction systems.

I'm starting out on that road with a presentation next week at JUDCon so if you're in the Boston area come along and join in. Don't worry, I'll have written the slides by then, I promise. I'm just not quite done yet :-)

Wednesday, April 20, 2011

Messaging/Database race conditions

Programmers are divided between those who only write single threaded code and those who will, at the slightest hint of provocation, tell interminably long war stories about debugging race conditions. Although I'm firmly in the second category, this is not such a story. It is however a lesson in what happens when spec authors spend insufficient time listening to them.

A common pattern in enterprise apps involves a business logic method writing a database update, then sending a message containing a key to that data via a queue to a second business method, which then reads the database and does further processing. Naturally this involves transactions, as it is necessary to ensure the database update(s) and message handling occur atomically.

methodA {
    beginTransaction();
    updateDatabase(key, values);
    sendMessage(queueName, key);
    commitTransaction();
}

methodB {
    beginTransaction();
    key = receiveMessage(queueName);
    values = readDatabase(key);
    doMoreStuff(values);
    commitTransaction();
}

So, like all good race condition war stories, this one starts out with a seemingly innocuous chunk of code. The problem is waiting to ambush us from deep within the transaction commit called by methodA.

Inside the transaction are two XAResources, one for the database and one for the messaging system. The transaction manager calls prepare on both, gets affirmative responses and writes its log. Then it calls commit on both, at which point the resource managers can end the transaction isolation and make the data/message visible to other processes.

Spot the problem yet?

Nothing in the XA or JTA specifications defines the order in which XAResources are processed. The commit calls to the database and message queue may be issued in either order, or even in parallel.

Therefore it is inevitable that every once in a while, probably on the least convenient occasions, the message will be delivered before the database commit has completed, and the database read in methodB will fail as the data has not yet been released from the previous transaction.

Oops.
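To make the ordering problem concrete, here is a toy, single-threaded model of the commit phase in Java. It is only a sketch: the class, the in-memory "database" and "queue", and the commitAndConsume method are all made up for illustration and involve no real JTA or JMS calls. It simply lets the consumer run after each individual resource commit, which is the window the real race lives in.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model of the race. All names here are hypothetical stand-ins.
class CommitOrderRace {
    // State that is visible to other transactions once committed.
    static final Map<String, String> database = new HashMap<>();
    static final Queue<String> queue = new ArrayDeque<>();

    // Plays the commit phase of methodA's transaction in the given order,
    // giving methodB a chance to run after each individual resource commit.
    static String commitAndConsume(List<String> commitOrder) {
        database.clear();
        queue.clear();
        String observed = "no message yet";
        for (String resource : commitOrder) {
            if (resource.equals("database")) {
                database.put("key-1", "values"); // row becomes visible
            } else {
                queue.add("key-1");              // message becomes visible
            }
            // methodB: runs as soon as a message is available.
            String key = queue.poll();
            if (key != null) {
                String values = database.get(key);
                observed = (values == null) ? "row not visible" : values;
            }
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(commitAndConsume(Arrays.asList("database", "queue")));
        System.out.println(commitAndConsume(Arrays.asList("queue", "database")));
    }
}
```

With the database committed first the consumer sees the row. With the queue committed first it receives the key but finds nothing to read. A real transaction manager may pick either order, or run both commits in parallel.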

So, having been let down by shortcomings in the specifications, what can the intrepid business logic programmer do to save the day?

Polling: Throw away the messaging piece and construct a loop that will look for new data in the db periodically. Yuck.

Messaging from afterCompletion: The transaction lifecycle provides an afterCompletion notification that is guaranteed to be ordered after all the XAResource commits complete, so we can send the message from there to avoid the race. Also icky, since it is then no longer atomic with the db write in the case of failures, which means we need polling as a backup anyhow.

Let it fail: An uncaught exception in the second tx, e.g. an NPE from the database read, should cause the container to fail the transaction and the messaging system to do message redelivery. Hopefully by that time the db is ready. Inefficient and prone to writing a lot of failure messages into your system logs, which is to say: yuck.

Force resource ordering: Many transaction systems offer some non-spec-compliant mechanism for XAResource ordering. Due to the levels of abstraction the container places between the transaction manager and the programmer, this is not always easy to use from Java EE. It also ties your code to a specific vendor's container. You can do this with JBoss if you really want to, but it's not recommended.

Delay the message delivery: Similar to the above, some messaging systems go above and beyond the JMS spec to offer the ability to mark a queue or specific message as having a built-in delay. The messaging system will hold the message for a specified interval before making it available to consumers. Again you can do this with JBoss (see 'Scheduled Messages' in HornetQ) but whilst it's easy to code it's not a runtime-efficient solution, as tuning the delay interval is tricky.

Backoff/Retry in the consumer: Instead of letting the consuming transaction fail and be retried by message redelivery, catch the exception and handle it with a backoff/retry loop:

methodB {
    beginTransaction();
    key = receiveMessage(queueName);

    do {
        values = readDatabase(key);
        if(values == null) {
            sleep(x);
        }
    } while(values == null);

    doMoreStuff(values);
    commitTransaction();
}

This has the benefit of being portable across containers and, with a bit of added complexity, pretty robust. Yes it is a horrible kludge, but it works. Sometimes when you are stuck working with an environment built on flawed specifications that's really the best you can hope for.
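The "bit of added complexity" is mostly about not looping forever. One possible shape for it, as a rough Java sketch: cap the number of attempts and lengthen the pause each time. The readWithBackoff helper, its parameters and the doubling policy are assumptions made for illustration, not anything taken from a container or JBossTS API. The Supplier stands in for readDatabase(key).

```java
import java.util.function.Supplier;

// Sketch only: readWithBackoff and its parameters are illustrative names.
class BackoffRead {
    // Retries the read until it returns non-null or maxAttempts is used up,
    // doubling the pause between attempts. Returns null if the data never
    // appears or the thread is interrupted while waiting.
    static <T> T readWithBackoff(Supplier<T> read, int maxAttempts, long initialDelayMillis) {
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T values = read.get();
            if (values != null) {
                return values;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return null;
                }
                delay *= 2;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for readDatabase(key): the row shows up on the third read.
        String values = readWithBackoff(() -> ++calls[0] < 3 ? null : "values", 5, 10);
        System.out.println(values + " after " + calls[0] + " reads");
    }
}
```

If the helper comes back with null, the consumer can throw and fall back on the "let it fail" behaviour, so message redelivery remains the safety net. Keep the total wait comfortably below the transaction timeout.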

Monday, April 4, 2011

World domination

It's nearing that time of year again... The transactions team are packing their bags and preparing to spend a couple of days holed up in a top secret underground lair, plotting the next steps towards world domination.

I normally spend a few days beforehand checking up on the state of the world, particularly new developments in transactions, the list of open issues in our bug tracker, the last few months' worth of support cases and other such sources of interesting information. I'm starting to think the reason that no Evil Genius has succeeded in taking over the world yet has very little to do with being repeatedly thwarted by James Bond and an awful lot to do with the sheer volume of paperwork involved. Perhaps if they spent less on underground lairs and more on secretarial support...

Anyhow, as per usual we have way too many ideas on the list, so some prioritization is in order. One of the key factors is the number of users a given chunk of work may benefit. Improvements to the crash recovery system benefit pretty much everyone. I'd like to say the same about documentation, but it's apparent only a tiny proportion of users actually read it. Somewhere in between lie improvements to optional modules or functionality that only some users take advantage of. The tricky part is gauging how many there are.

Users occasionally get in touch and tell us how they are using the code. Unfortunately this usually happens only when they need help. The rest of the time we mostly just make educated guesses about the use cases. I was mulling over this problem today when a tweet caught my eye:

"Today's royalty check for "Advanced CORBA Programming": $24.51. Last year each check was ~$400. I guess CORBA is officially dead now :-)" - Steve Vinoski

This got me thinking about the future of JTS. In Java EE environments, JTS exists pretty much only to allow transactional calls between EJBs using IIOP. Now that we have @WebService on EJBs, how much traffic still uses IIOP anyhow? What are the odds of RMI/IIOP (and hence JTS) being deprecated out of Java EE in the next revision or two? How much time should we be sinking into improvements of JTS vs. web services transactions?

That's just one of many topics on the agenda for our team meeting. I suppose we could also debate the relative merits of man-portable freeze rays vs. orbital laser cannons, but I don't think the corporate health and safety folks will let us have either. Spoilsports. Guess we'll just have to continue progressing towards world domination the hard way.