Thursday, January 11, 2018

Narayana periodic recovery of XA transactions

Let's talk about transaction recovery, with details specific to Narayana.
This blog post covers JTA transactions. If you configure recovery for JTS, you can still find relevant information here, but you will also need to consult the Narayana documentation.

What is transaction recovery

Transaction recovery is the process needed when an active transaction fails for some reason. The transaction manager process (the JVM) could crash, the connection to a resource (a database) could fail, or the failure could have any other cause.
The failure could happen at various points in the transaction lifetime, and that point defines the state in which the in-progress transaction was left. The state could be just an in-memory state which is left behind, in which case the transaction manager relies on the resource's transaction timeout to release it. But it could also be the transaction state after prepare was called (by a successful prepare call the 2PC participant confirms that it is capable of finishing the transaction and, moreover, promises to finish it with commit). There has to be a process which finishes such transaction remainders. That process is transaction recovery.
Let's review three variants of failure which lead to three different transaction states. Their results and their need for termination will guide us through the work of the transaction recovery process.

The transaction manager runs a global transaction which includes several transaction branches (to use the term from the XA specification). In our article about 2PC we used (not precisely) the term resource-located transaction instead of transaction branch.
Let's say we have a global transaction containing a data insertion into a database plus sending a message to a message broker queue.
We will examine a crash of the transaction manager (the JVM process) in three example cases. The examples show the timeline of actions and differ in the time when the crash happens.

  1. The global transaction was started and the insertion into the database happened; now the JVM crashes (no message was sent to the queue). In this situation, all the transaction metadata is held only in memory. After the JVM is restarted the transaction manager has no notion that the global transaction ever existed.
    But the insertion into the database already happened and the database has some work in progress. There was no promise, made by a prepare call, to end with commit, and everything was stored only as in-memory state, thus the transaction manager relies on the database to abort the work itself. Normally that happens when the transaction timeout expires.
  2. The global transaction was started, data was inserted into the database and a message was sent to the queue. The global transaction is asked to commit. The two-phase commit begins: prepare is called on the database (the resource-located transaction). Now the transaction manager (the JVM) crashes. If interleaving data manipulation were permitted, the 2PC commit could fail. But calling prepare means a promise of a successful end. Thus the prepare call causes locks to be taken to prevent other transactions from interleaving.
    When the transaction manager is restarted, again no notion of the transaction can be found, as all the in-memory state was cleared. And nothing had been saved in the Narayana transaction log so far.
    But the database transaction is in the prepared state and holds locks. On top of that, a transaction in the prepared state can't be rolled back by a transaction timeout and needs to wait for some other party to finish it.
  3. The global transaction was started, data was inserted into the database and a message was sent to the queue. The transaction was asked to commit. The two-phase commit begins: prepare is called on the database and on the message queue too. A record of the success of the prepare phase is saved to the Narayana transaction log. Now the transaction manager (the JVM) crashes.
    After the transaction manager is restarted there is no in-memory state, but we can observe the record in the Narayana transaction log, and both the database and the JMS queue resource-located transactions are in the prepared state with locks.

In the latter two cases the transaction state survived the JVM crash: in the second case only on the database side, as locked records; in the third case a record is present in the transaction log too. In the first case only the in-memory transaction representation was used, and the transaction manager is not responsible for finishing it.
The work of finishing the unfinished transactions belongs to the recovery manager. The purpose of the recovery manager is to periodically check the state of the Narayana transaction log and of the resource transaction logs (the unfinished resource-located transactions), for which it runs the JTA API call XAResource.recover().
If an in-doubt transaction is found, the recovery manager either rolls it back, as in the second case, or commits it because the whole prepare phase originally finished with success, as in the third case.
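To make the decision rule concrete, here is a minimal sketch in plain Java. It is only a model of the reasoning above, not Narayana code: the class name, the string identifiers and the two sets are illustrative assumptions. A branch that a resource reports as prepared is committed when the transaction log knows about it (case three) and rolled back otherwise (case two); the first case never shows up, because nothing was prepared.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class RecoveryDecisionSketch {

    enum Outcome { COMMIT, ROLLBACK }

    // preparedAtResource: ids of branches a resource reports as prepared (in-doubt).
    // loggedByManager: ids for which the transaction log holds a record of a successful prepare phase.
    static Map<String, Outcome> decide(Set<String> preparedAtResource, Set<String> loggedByManager) {
        Map<String, Outcome> decisions = new TreeMap<>();
        for (String id : preparedAtResource) {
            // A log record means the prepare phase finished, so the branch can be committed.
            decisions.put(id, loggedByManager.contains(id) ? Outcome.COMMIT : Outcome.ROLLBACK);
        }
        return decisions;
    }

    public static void main(String[] args) {
        // "case2" is prepared at the database only, "case3" is also present in the log.
        System.out.println(decide(Set.of("case2", "case3"), Set.of("case3")));
    }
}
```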

Narayana periodic recovery in detail

The periodic recovery is a configurable process. That brings flexibility of usage but makes it necessary to use proper settings if you want to run it.
We recommend checking the Narayana documentation, the chapter Failure Recovery.
The recovery runs periodically (by default every two minutes; the period can be changed by setting the system property RecoveryEnvironmentBean.periodicRecoveryPeriod). When launched it iterates over all registered recovery modules (see com.arjuna.ats.arjuna.recovery.RecoveryModule in the Narayana codebase) and runs the following sequence: it calls the method periodicWorkFirstPass on all recovery modules, waits for the time defined by RecoveryEnvironmentBean.recoveryBackoffPeriod, and then calls the method periodicWorkSecondPass on all recovery modules.
When you want to run standard JTA XA transactions (JTS differs; you can check the config example in the Narayana codebase) you need to configure the XARecoveryModule. The XARecoveryModule in turn brings the need to configure XAResourceOrphanFilters, which manage finishing in-doubt transactions that are available only on the resource side (the second case represents such a scenario).
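The two-pass cycle described above can be pictured with a short sketch. The interface TwoPassModule is a hypothetical local stand-in that only borrows the two method names from Narayana's RecoveryModule; the real recovery manager does more (threading, scheduling, error handling), so take this only as an illustration of the ordering.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPassCycleSketch {

    // Hypothetical stand-in mirroring the two method names of Narayana's RecoveryModule.
    interface TwoPassModule {
        void periodicWorkFirstPass();
        void periodicWorkSecondPass();
    }

    // One recovery cycle: first pass on every module, back-off wait, second pass on every module.
    static void runCycle(List<TwoPassModule> modules, long backoffMillis) throws InterruptedException {
        for (TwoPassModule module : modules) {
            module.periodicWorkFirstPass();
        }
        Thread.sleep(backoffMillis);
        for (TwoPassModule module : modules) {
            module.periodicWorkSecondPass();
        }
    }

    // Helper producing a module that records the order in which its passes are called.
    static TwoPassModule recording(String name, List<String> calls) {
        return new TwoPassModule() {
            @Override public void periodicWorkFirstPass() { calls.add(name + ":first"); }
            @Override public void periodicWorkSecondPass() { calls.add(name + ":second"); }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> calls = new ArrayList<>();
        runCycle(List.of(recording("A", calls), recording("B", calls)), 10L);
        System.out.println(calls);
    }
}
```

The back-off between the passes gives transactions that were still in flight during the first pass a chance to finish on their own before the second pass acts on what is left.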

Narayana periodic recovery configuration

You may ask how all this is configured, right?
The Narayana configuration is held in "beans". The beans contain properties which are retrieved by getter method calls all over the Narayana code. So configuring Narayana's behaviour means redefining the values of the required bean properties.
Let's check which beans are relevant for setting up transaction management and recovery for XA transactions. We will use the jdbc transactional driver quickstart as an example.
The relevant beans include RecoveryEnvironmentBean and JTAEnvironmentBean, both of which are referenced later in this post.

To configure the values of the properties you need to define them in one of the following ways:
  • via a system property; see the example in the quickstart pom.xml. We can see that the property is passed on the JVM argument line.
  • via the descriptor file jbossts-properties.xml.
    This is usually the main source of configuration in standalone applications using Narayana. You can see the example jbossts-properties.xml in the quickstart; our standalone application is no exception.
    The descriptor has to be on the classpath for Narayana to be able to access it.
  • via calls of bean setter methods.
    This is the programmatic approach, normally used mainly in managed environments such as the WildFly application server.

  • A system property takes precedence over the descriptor jbossts-properties.xml.
  • A programmatic call of a setter method takes precedence over system properties.
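The precedence rules can be modelled in a few lines of plain Java. This is not how Narayana resolves its beans internally; the method resolve and its parameters are assumptions made for illustration, and the values are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class ConfigPrecedenceSketch {

    // Resolution order as described in the post:
    // setter value, then system property, then descriptor entry, then the built-in default.
    static String resolve(String name, String setterValue, Properties systemProps,
                          Map<String, String> descriptor, String defaultValue) {
        if (setterValue != null) {
            return setterValue;
        }
        String fromSystem = systemProps.getProperty(name);
        if (fromSystem != null) {
            return fromSystem;
        }
        return descriptor.getOrDefault(name, defaultValue);
    }

    public static void main(String[] args) {
        String name = "RecoveryEnvironmentBean.periodicRecoveryPeriod";
        Map<String, String> descriptor = new HashMap<>();
        descriptor.put(name, "90");               // stands for an entry in jbossts-properties.xml
        Properties systemProps = new Properties();
        systemProps.setProperty(name, "60");      // stands for -D... on the JVM argument line

        System.out.println(resolve(name, null, systemProps, descriptor, "120")); // system property wins
        System.out.println(resolve(name, "30", systemProps, descriptor, "120")); // setter wins
    }
}
```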

The default settings for the narayana-jts-idlj.jar artifact used here can be seen at https://github.com/jbosstm/narayana/blob/master/ArjunaJTS/narayana-jts-idlj/src/main/resources/jbossts-properties.xml. Those are (in combination with the settings inside the particular beans) the default values used when you don't have any properties file defined.
For more details on configuration check the Narayana.io documentation.


If you want to use the programmatic approach and call the bean setters you need to obtain the bean instance first. That is normally done by calling a static method of a PropertyManager class. There are several of them, depending on what you want to configure.
The ones relevant for us are:

We will examine the programmatic approach using the example of the jdbc transactional driver quickstart, inside the recovery utility class, where the property controlling the values of XAResourceRecovery is reset in the code.
If you want to know the exact name of the system property or of the entry in jbossts-properties.xml, the rule of thumb is to take the short class name of the bean, add a dot, and append the name of the property.
For example, let's say you want to redefine the time period of the periodic recovery cycle. Then you need to visit the RecoveryEnvironmentBean and find the name of the variable, which is periodicRecoveryPeriod. Following the rule of thumb, you use the name RecoveryEnvironmentBean.periodicRecoveryPeriod to redefine the default value of 2 minutes.
Some beans use the annotation @PropertyPrefix, which offers another way of naming the property when setting it up. In the case of periodicRecoveryPeriod we can use the system property named com.arjuna.ats.arjuna.recovery.periodicRecoveryPeriod to reset it in the same way.
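The rule of thumb is simple enough to write down as code. The helper below is only an illustration (the class and method names are made up for this post); the value 60 is an arbitrary choice, and I assume here that the period is given in seconds, which matches the default of two minutes being stored as 120.

```java
public class PropertyNameSketch {

    // Rule of thumb: short class name of the bean + "." + name of the property.
    static String propertyName(String beanClassName, String property) {
        String shortName = beanClassName.substring(beanClassName.lastIndexOf('.') + 1);
        return shortName + "." + property;
    }

    public static void main(String[] args) {
        String name = propertyName(
                "com.arjuna.ats.arjuna.common.RecoveryEnvironmentBean", "periodicRecoveryPeriod");
        // The system property has to be set before Narayana reads its configuration.
        System.setProperty(name, "60");
        System.out.println(name + "=" + System.getProperty(name));
    }
}
```

Running it prints RecoveryEnvironmentBean.periodicRecoveryPeriod=60, which is the same name you would pass with -D on the JVM argument line.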

Note: an interesting reference could be the settings for the integration with Tomcat, which uses the programmatic approach; see NarayanaJtaServletContextListener.

Thinking about XA recovery

I hope you now have a better picture of the recovery setup and how it works.
The XARecoveryModule is responsible for handling recovery of 2PC XA transactions. It commits unfinished transactions and handles orphans by running the registered XAResourceOrphanFilters.

As you could see, we configured two RecoveryModules, XARecoveryModule and AtomicActionRecoveryModule, in the jbossts-properties.xml descriptor.
The AtomicActionRecoveryModule is responsible for loading resources from the object store; if a resource is serializable and was saved as a whole in the Narayana transaction log, it can be deserialized and used immediately during recovery.
This is often not the case. When the XAResource is not serializable (serializability is hard to achieve, for example, for a database, where we need a connection to do any work) Narayana offers resource-initiated recovery. That requires a class (code and settings) that can provide XAResources for recovery purposes. To get the resource we need a connection (to a database, to a JMS broker...). The XARecoveryModule uses objects of two interfaces to get the XAResources for recovery.
Those interfaces are XAResourceRecoveryHelper and XAResourceRecovery.

Both interfaces contain a method to retrieve the resources (XAResourceRecoveryHelper.getXAResources(), XAResourceRecovery.getXAResource()). The XARecoveryModule then asks all the received XAResources to find in-doubt transactions, aka resource-located transactions (by calling XAResource.recover()).
The in-doubt transactions found are then paired with the transactions in the Narayana transaction log store. If a match is found, XAResource.commit() can be called.
Maybe you wonder why there are two interfaces that are mostly the same (which is kind of true), but their use differs. The XAResourceRecoveryHelper is designed (and only available) to be used programmatically. To add a helper among the others you need to call XARecoveryModule.addXAResourceRecoveryHelper(). You can also deregister the helper by calling XARecoveryModule.removeXAResourceRecoveryHelper().
The XAResourceRecovery is configured not directly in the code but via the property com.arjuna.ats.jta.recovery.XAResourceRecovery. This is not suitable for dynamic changes, as under normal circumstances it's not possible to reset it, even when you try to change the values by calling JTAEnvironmentBean.setXaResourceRecoveryClassNames().
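The scan-and-pair step described above can be sketched with nothing but the standard javax.transaction.xa API. This is a simplified illustration, not the XARecoveryModule implementation: the class XaScanSketch, the method names and the list standing in for the Narayana transaction log are assumptions. Unmatched branches are only returned; deciding about them is the job of the orphan filters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

public class XaScanSketch {

    // Xid implementations need not override equals(), so compare the three parts explicitly.
    static boolean sameXid(Xid a, Xid b) {
        return a.getFormatId() == b.getFormatId()
                && Arrays.equals(a.getGlobalTransactionId(), b.getGlobalTransactionId())
                && Arrays.equals(a.getBranchQualifier(), b.getBranchQualifier());
    }

    /**
     * Commits every in-doubt branch that has a matching entry in 'logged'
     * and returns the branches without a match (orphan candidates).
     */
    static List<Xid> commitMatched(XAResource resource, List<Xid> logged) throws XAException {
        // A single call with both flags starts and ends the recovery scan.
        Xid[] inDoubt = resource.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        List<Xid> unmatched = new ArrayList<>();
        if (inDoubt == null) {
            return unmatched;
        }
        for (Xid xid : inDoubt) {
            boolean known = logged.stream().anyMatch(entry -> sameXid(entry, xid));
            if (known) {
                resource.commit(xid, false); // false: not one-phase, the branch is already prepared
            } else {
                unmatched.add(xid);
            }
        }
        return unmatched;
    }
}
```

In a real setup the XAResource passed in would come from one of the two interfaces above, and the list of logged Xids from the object store.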

Running the recovery manager

We have explained how to configure the recovery properties but we haven't pointed out one important fact: in a standalone application there is no automatic launch of the recovery manager. You need to start it manually.
The good news is that it's quite easy (if you don't use an ORB); it's enough to call just


      import com.arjuna.ats.arjuna.recovery.RecoveryManager;
      RecoveryManager manager = RecoveryManager.manager();
      manager.initialize();
    
This runs an indirect recovery manager (RecoveryManager.INDIRECT_MANAGEMENT), which spawns a thread that runs the recovery process periodically. If you feel that you need to run the recovery at times of your choosing (the periodic timeout value is then not used) you can use direct management and run it manually

      RecoveryManager manager = RecoveryManager.manager(RecoveryManager.DIRECT_MANAGEMENT);
      manager.initialize();
      manager.scan();
    
To stop the recovery manager use the terminate call.

      manager.terminate();
    

Summary

This blog post tried to introduce the process of transaction recovery in Narayana.
The goal was to present the settings necessary for the recovery to work in the expected way for XA transactions, and to show how to start the recovery manager in your application.

Recovery of Narayana jdbc transactional driver

The post about the jdbc transactional driver introduced ways you can start to code with it. That post talks about enlisting JDBC work into the global transaction but omits the topic of recovery.
And it's here that using the transactional driver brings another benefit, with ready-to-use approaches for setting up the recovery. As in the prior article, we will work with the jbosstm quickstart transactionaldriver-standalone.

After reading the post about transaction recovery you will know that, for the recovery manager to consider any resource that we enlist into the global transaction, we have to:

  • either ensure that the resource can be serialized into the Narayana transaction log store (the resource has to be serializable), in which case the recovery manager deserializes the XAResource and uses it directly to get data from it
  • or register a recovery counterpart of the transaction enlistment which is capable of providing an instance of XAResource to the RecoveryModule;
    in other words, we need an implementation of XAResourceRecoveryHelper or XAResourceRecovery (and that's where the transactional driver can help us).

For configuration of the recovery properties we use the jbossts-properties.xml descriptor in our quickstart. We leave most of the properties at their default values but we still need to concentrate on setting up the recovery.
You can observe that it serves to define recovery modules, orphan filters and XA resource recovery helpers. Those entries are important for the recovery to work for our transactional JDBC driver.
For a better understanding of what was set I recommend checking the comments there.

JDBC transaction driver and the recovery

In the prior article about the JDBC driver we introduced three ways of providing connection data to the transaction manager (it then wraps the connection and provides transactionality in the user application). Let's go through the driver enlistment variants and check how the recovery can be configured for each of them.

XADataSource provided within Narayana JDBC driver properties

This is the variant where the XADataSource is created directly in the code and the created(!) instance is then passed to the jdbc transactional driver.
As the transactional driver receives the resource directly, it has no clue how to create such a resource on its own during recovery. We have to help it with our own implementation of XAResourceRecovery. That of course has to be registered in the environment bean (in the quickstart it's intentionally commented out, as for testing purposes we need to switch between the different variants).
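To give an idea of what such an implementation has to do, here is a rough sketch that compiles with the JDK alone. The interface RecoverySource is a hypothetical local stand-in whose method names follow my reading of Narayana's com.arjuna.ats.jta.recovery.XAResourceRecovery; please check the exact signatures in the codebase and look at the quickstart for the real class. The datasource is passed in through the constructor for brevity, while a class registered by name would presumably need a no-argument constructor and would have to build the datasource itself.

```java
import java.sql.SQLException;
import javax.sql.XAConnection;
import javax.sql.XADataSource;
import javax.transaction.xa.XAResource;

public class DataSourceRecoverySketch {

    // Hypothetical local stand-in for Narayana's XAResourceRecovery contract.
    interface RecoverySource {
        boolean initialise(String config) throws SQLException;
        boolean hasMoreResources();
        XAResource getXAResource() throws SQLException;
    }

    static class SingleDataSourceRecovery implements RecoverySource {
        private final XADataSource dataSource;
        private XAConnection connection;
        private boolean handedOut;

        SingleDataSourceRecovery(XADataSource dataSource) {
            this.dataSource = dataSource;
        }

        @Override
        public boolean initialise(String config) {
            return true; // nothing to parse in this sketch
        }

        // Answers true once per scan, then false, so each scan sees exactly one resource.
        @Override
        public boolean hasMoreResources() {
            handedOut = !handedOut;
            return handedOut;
        }

        // Opens the XA connection lazily and reuses it for later scans.
        @Override
        public XAResource getXAResource() throws SQLException {
            if (connection == null) {
                connection = dataSource.getXAConnection();
            }
            return connection.getXAResource();
        }
    }
}
```

The sketch leaves out error handling; a real implementation would also have to cope with a connection that went stale between two recovery scans.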



XADataSource bound to JNDI

This variant binds the XADataSource to a JNDI name. The recovery can look up the provided JNDI name and use the datasource to create an XAConnection for finding the in-doubt transactions on the database side.
The point here is to pass the JNDI name to the recovery process so that it knows what to search for.
The jdbc driver uses an XML descriptor for that purpose, in two variants: one for JDBCXARecovery and one for BasicXARecovery.


In fact, there is no big difference between those two variants and you can use whichever suits you better. In both versions you provide the JNDI name to be looked up.

XADataSource connection data provided in properties file

The last variant uses a properties file, whose connection information is already used during resource enlistment. The same properties file is then automatically used for recovery. You don't need to set any property manually. The recovery is set up automatically because the location of the properties file is serialized into the object store and then loaded during recovery.


In this case you configure the PropertyFileDynamicClass, which provides datasource credentials for both the transaction manager and the recovery. If you would like to extend the behaviour you can implement your own DynamicClass (please consult the codebase). For the recovery to work automatically you need to work with the RecoverableXAConnection.


Summary

There are currently three approaches available for setting up recovery with the jdbc transactional driver:
  • creating your own XAResourceRecovery/XAResourceRecoveryHelper, which is feasible if you want to control the creation of the datasource and the jdbc connection on your own;
  • using one of the prepared XAResourceRecovery classes, either JDBCXARecovery or BasicXARecovery, where you provide an XML file in which you specify the JNDI name of the datasource;
  • using a properties file which defines the credentials for the connection and for the recovery too.