Transaction Design and Management Options

Here are some general remarks on handling transactions in web-based applications, with particular relevance to the pipeline server.

A transaction is a unit of work that changes application state - whether on disk, in memory or in a database - that, once started, is completed entirely, or not at all.

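Transactions can be demarcated - started, and ended with a commit or rollback - by the EJB container, by bean code, or by client code. Note that plain Tomcat is a servlet container rather than an EJB container; container-managed demarcation needs an EJB-capable server such as TomEE, JBoss, or GlassFish.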

Statement of the problem

The pipeline server needs to submit several tasks at the same time (concurrent processing). Each task can have its own set of subtasks and processes attached to it. Each task is associated with several database tables that it reads from and writes to in order to complete. These reads and writes are performed as SQL transactions over a JDBC connection.

Requirement 1. The pipeline server needs to manage task and subtask creation as a single transaction. This is an application-level transaction.
Requirement 2. The JDBC connection must not autocommit after each statement it executes, because one task transaction will require several database updates that form a logical group (requirement 1).
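
Requirement 2 can be sketched with plain JDBC as follows. This is a minimal illustration, not the pipeline server's actual code: the class name TaskTransaction and the method executeAll are hypothetical, and the caller is assumed to supply an open connection and the list of update statements that make up one task.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

/** Hypothetical helper: runs a group of updates as one database transaction. */
final class TaskTransaction {

    /** Commits only if every statement succeeds; otherwise rolls back and rethrows. */
    static void executeAll(Connection conn, List<String> updates) throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false); // no commit after each individual statement
        try {
            try (Statement st = conn.createStatement()) {
                for (String sql : updates) {
                    st.executeUpdate(sql);
                }
            }
            conn.commit(); // all updates become visible together
        } catch (SQLException e) {
            try {
                conn.rollback(); // undo every update of this group
            } catch (SQLException rollbackFailure) {
                e.addSuppressed(rollbackFailure);
            }
            throw e;
        } finally {
            conn.setAutoCommit(previous); // restore the caller's setting
        }
    }
}
```

Each thread is assumed to use its own connection, so that a rollback in one thread only discards that thread's uncommitted updates.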

The pipeline server scheduler will run several of these tasks in separate threads at any given time. Each task accesses the database and modifies database tables at the application level, and the task submission can fail at any point before the final commit of the table updates is called. In that case the pipeline server needs to roll back the set of changes initiated by that thread.

One of the basic questions is what happens if one thread initiates updates and then performs a rollback at the application level, while another thread performs a successful commit at the application level. Is there a potential for deadlock, and in what state is the database left?

Requirement 3. The thread that first succeeds in its application-level commit promotes a new state of the database. The thread that failed needs to perform a rollback at the application level, and cannot change the state of the database until it completes a new successful transaction.
Requirement 4. All the methods of a thread that deal with database statements need to either all succeed or all fail collectively.
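
Requirement 3 describes a first-committer-wins rule. The following in-memory sketch illustrates that rule only by analogy; it does not show how a database enforces it, and the names Snapshot and StateStore are hypothetical. A commit succeeds only if the state has not changed since the transaction began, so the losing thread leaves the state untouched and has to start a new transaction.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Immutable view of the shared state at one version (hypothetical). */
final class Snapshot {
    final long version;
    final String value;

    Snapshot(long version, String value) {
        this.version = version;
        this.value = value;
    }
}

/** Hypothetical store in which the first committer wins. */
final class StateStore {
    private final AtomicReference<Snapshot> current =
            new AtomicReference<>(new Snapshot(0, "initial"));

    /** Starts a transaction by taking the current snapshot as its base. */
    Snapshot begin() {
        return current.get();
    }

    /**
     * Promotes a new state only if nobody committed since 'base' was taken.
     * Returns false for a stale base; the caller then discards its work.
     */
    boolean commit(Snapshot base, String newValue) {
        return current.compareAndSet(base, new Snapshot(base.version + 1, newValue));
    }
}
```

If two threads call begin() on the same state, the first commit returns true and the second returns false; the second thread must call begin() again before it can change anything.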

Container-Managed Transactions

Demarcating transactions at the EJB server level is the most efficient option. Transactions are costly application resources, especially database transactions, because they reserve a network connection for the duration of the transaction. In a multi-tiered architecture with database, application server, and web layers, you optimize performance by reducing network round trips. The best approach is to start and stop transactions at the application server level, i.e., in the EJB container.

Container-managed transactions (CMTs) are supported by all bean types: session, entity, and message-driven. They provide good performance, and simplify development because the enterprise bean code does not include statements that begin and end the transaction.

Each method in a CMT bean can be associated with a single database transaction, but does not have to be. In a container-managed transaction, the EJB container manages the transaction, including start, stop, commit, and rollback. Usually, the container starts a transaction just before a bean method starts, and commits it just before the method exits.

If a system exception is thrown during a transaction, the container will automatically roll back the transaction. You can also explicitly program rollbacks in your bean.
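
To make the begin/commit/rollback sequence concrete, here is a plain-Java sketch that imitates what a container does around a bean method. It uses no EJB API and is not how any real container is implemented: TaskService, ContainerSketch, and manage are hypothetical names, the transaction events are only recorded in a list, and unchecked exceptions are assumed to play the role of system exceptions.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical business interface; stands in for a bean's interface. */
interface TaskService {
    void createTask(String name);
}

/** Imitates container demarcation by recording events around each call. */
final class ContainerSketch {
    final List<String> log = new ArrayList<>();

    /** Wraps 'bean' so that every method call is surrounded by begin/commit/rollback. */
    <T> T manage(Class<T> iface, T bean) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add("begin"); // transaction starts just before the method
            try {
                Object result = method.invoke(bean, args);
                log.add("commit"); // and is committed just before it returns
                return result;
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause();
                // Assumption: only unchecked exceptions trigger the automatic rollback.
                log.add(cause instanceof RuntimeException ? "rollback" : "commit");
                throw cause;
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }
}
```

The bean passed to manage contains no transaction statements at all, which is the main attraction of container-managed demarcation.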

Bean-Level Transaction Management

In a bean-managed transaction, the EJB code manages the transaction, including start, stop, commit, and rollback. Bean-managed transactions are supported by all session and message-driven beans; you cannot use bean-managed transactions with entity beans.

When to Use Bean-Managed Transactions

These are examples of requirements that may dictate the use of bean-managed transactions:

You need to define multiple database transactions within a single method call. With container-managed transactions, a method can only be associated with a single transaction. You can use a bean-managed transaction to define multiple transactions within a single method. However, it is worth avoiding the need for a bean-managed transaction by breaking the method into multiple methods, each with its own container-managed transaction.

You need to define a single transaction that spans multiple EJB method calls. For example, a stateful session EJB might use one method to begin a transaction and another method to commit or roll it back. It is best to avoid this practice, because it requires detailed knowledge of the workings of the EJB object. However, if this scenario is required, you must use bean-managed transaction coordination, and you must coordinate client calls to the respective methods.
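
The first case, several transactions inside one method call, can be sketched as follows. SimpleTx is a hypothetical stand-in that offers only begin, commit, and rollback; it is not a real EJB or JTA type, and BatchProcessor is likewise an illustrative name.

```java
import java.util.List;

/** Hypothetical transaction handle with explicit demarcation methods. */
interface SimpleTx {
    void begin();
    void commit();
    void rollback();
}

/** Bean-style code that demarcates its own transactions. */
final class BatchProcessor {
    private final SimpleTx tx;

    BatchProcessor(SimpleTx tx) {
        this.tx = tx;
    }

    /** One method call, one transaction per batch; returns how many batches committed. */
    int processAll(List<Runnable> batches) {
        int committed = 0;
        for (Runnable batch : batches) {
            tx.begin();
            try {
                batch.run();
                tx.commit();
                committed++;
            } catch (RuntimeException e) {
                tx.rollback(); // only this batch is undone; earlier commits stand
            }
        }
        return committed;
    }
}
```

The same result could be reached with container-managed transactions by moving the body of the loop into its own method, as recommended above.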

Client-Level Transaction Management is Costly

Client applications are subject to interruptions or unexpected terminations. If you start and stop a transaction at the client level, you risk:

Consumption of network resources while the transaction waits on user actions or interruptions, until client activity resumes or a timeout occurs.
Consumption of processing and network resources to roll back the transaction after a timeout or termination.

In conclusion: it is better not to manage transactions in client applications unless there are overriding reasons to do so.

Proposed Design

If the requirements above need to be implemented, the proposed design is to use the 'Transaction' design pattern.

The purpose of the transaction pattern is to group a collection of methods so that they either all succeed or they all fail collectively.

Recall the well-known example of transferring funds from one bank account to another: the withdrawal and the deposit must either both succeed or both fail. Thus we expect several methods to be fully synchronized, and a recovery option should be available. The main players in the pattern are:


/** 
  * The interface that defines all the methods to control every participant 
  */

interface TransactionParticipant {
    boolean join(long transactionID);
    void commit(long transactionID);
    void cancel(long transactionID);
}


/** 
  * interface that defines the business methods. Methods can throw Exceptions
  * as a signal of failure.
  */
  
interface SpecificParticipant extends TransactionParticipant {
    boolean operation1(long transactionID);
    boolean operation2(long transactionID);
    // ...
}

/**
  * implements all the interface methods and defines what happens if the
  * transaction manager decides to cancel() or commit(). It has to keep a
  * reference to the original state, to be able to restore when the cancel
  * is invoked (Memento pattern is recommended).
  */
  
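public class ConcreteParticipant implements SpecificParticipant {
    public void cancel(long transactionID) { /* restore the saved state */ }
    public void commit(long transactionID) { /* discard the saved state */ }
    // ...
}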

/**
  * Calls the join method on the participants and ultimately calls either cancel
  * or commit on the participants.
  */
public class TransactionManager {
    // ...
}

We use the transaction as follows:

Create a transactionID (either as a long or as an object).
Invoke join on all participants, and abort the transaction if joining fails for any of the participants.
Try the action: invoke the business methods, and call cancel as soon as any participant fails.
If the action is completed, call commit on all participants.
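
The steps above can be sketched with the bank transfer example. This is one possible reading of the pattern under stated assumptions, not a definitive implementation: Account and its add method are hypothetical, the memento is simply the balance saved at join time, a participant is assumed to take part in one transaction at a time, and a business method signals failure by throwing an unchecked exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

interface TransactionParticipant {
    boolean join(long transactionID);
    void commit(long transactionID);
    void cancel(long transactionID);
}

/** Hypothetical participant: an account that can restore its balance on cancel. */
final class Account implements TransactionParticipant {
    private static final long NO_TX = -1;
    private long balance;
    private long savedBalance;      // memento taken at join time
    private long currentTx = NO_TX;

    Account(long balance) { this.balance = balance; }

    public synchronized boolean join(long transactionID) {
        if (currentTx != NO_TX) return false; // busy in another transaction
        currentTx = transactionID;
        savedBalance = balance;
        return true;
    }

    public synchronized void commit(long transactionID) {
        if (currentTx == transactionID) currentTx = NO_TX; // keep the new balance
    }

    public synchronized void cancel(long transactionID) {
        if (currentTx == transactionID) {
            balance = savedBalance; // restore the original state
            currentTx = NO_TX;
        }
    }

    /** Business method: adds a (possibly negative) amount within a transaction. */
    synchronized void add(long transactionID, long amount) {
        if (currentTx != transactionID) throw new IllegalStateException("not joined");
        if (balance + amount < 0) throw new IllegalStateException("insufficient funds");
        balance += amount;
    }

    synchronized long balance() { return balance; }
}

final class TransactionManager {
    private final AtomicLong nextId = new AtomicLong(1);

    /** Runs 'action' over the participants; returns true if committed, false if cancelled. */
    boolean run(List<? extends TransactionParticipant> participants, LongConsumer action) {
        long id = nextId.getAndIncrement();                 // step 1: create the ID
        List<TransactionParticipant> joined = new ArrayList<>();
        for (TransactionParticipant p : participants) {     // step 2: join all
            if (!p.join(id)) {
                cancelAll(joined, id);
                return false;
            }
            joined.add(p);
        }
        try {
            action.accept(id);                              // step 3: try the action
        } catch (RuntimeException e) {
            cancelAll(joined, id);
            return false;
        }
        for (TransactionParticipant p : joined) p.commit(id); // step 4: commit all
        return true;
    }

    private static void cancelAll(List<TransactionParticipant> joined, long id) {
        for (TransactionParticipant p : joined) p.cancel(id);
    }
}
```

A transfer is then a call such as manager.run(List.of(a, b), id -> { b.add(id, 30); a.add(id, -30); }). If the debit fails after the credit has already been applied, cancel restores both balances to their values at join time.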