Friday, October 8, 2010

Clustering J2EE Applications

All mission critical applications need to be have high Availability and Scalability features built-in. Any incident or outage in such applications can have huge implications – like business loss or legal issues faced by the company. Clustering helps in application scalability (through load balancing) as well as high availability (through failover).

Clustering J2EE Applications
Never assume that stand-alone applications can be transmit transparently to a cluster structure.
Most of the leading application server vendors like IBM (Websphere) and Oracle (BEA Weblogic) support clustering their servers and provide for built-in load balancing among them.

It might seem that Clustering is a deployment or server related activity and we as normal Java developers might not be concerned with it. This is not true. There are various aspects which the Java developers need to take care while working on design, coding and testing for creating cluster aware, scalable J2EE applications.

Figure 1: Visualize Clustering

Application Design and Coding Considerations
Below are some basic points to consider while developing and maintaining applications - which will ultimately be deployed in a clustered environment in production.

User Session
HttpSession object is responsible for maintaining user session across requests. Session object is tracked by a unique session id.

To provide transparent failover in a cluster, application servers need to ‘replicate’ session state on multiple servers. As a result, based on the vendor implementation, session id might vary across client requests.
  • ‘session id’ should not be used in an application for operations synonymous to user id or should not be used to mark any transaction because it may not remain same throughout the user session. For such purposes, user id would be a more appropriate choice.
  • All objects stored in the session must be made Serializable. If any Java object, say Hashtable or any user defined object, is to be kept in session then that object must implement interface.
  • Object serialization and de-serialization are very costly in performance especially in the database persistent approach. In clustered environment, storing large or numerous objects in the session should be avoided.
  • Updating data in HttpSession - While updating any session attribute, use HttpSession.setAttribute() explicitly. This call will make sure that updated object is reflected in all session copies maintained on different servers by the application server.
  • Setting and getting values to-from HttpSession – Deprecated APIs getValue(), putValue(), removeValue() should not be used. Instead getArribute(), setAttribute() and removeAttribute() should be used.
Static Variables and Singletons
  • Do not use static variables for any logic/data that is at an application level. Simple reason being that a variable value will not be available to the JVMs in other machines and next client request can go to any server.
  • Avoid use of Singleton class that encapsulates logic for whole application (across JVMs in a cluster). In a single machine on one JVM singleton class will work. But in a cluster with multiple JVMs, having single instance will not be possible.
Data Caching
  • Application may have data cached in memory from secondary storage like database. It would be much faster to access reference data and data that doesn’t change frequently from memory.
  • Synchronize Cache - In a clustered environment, each JVM instance will maintain its own copy of the cache, which should be synchronized with others to provide consistent state in all server instances. Sometimes this kind of sync will bring worse performance than no caching at all.
File Access

Although not recommended by the J2EE specification, the external I/O operations are used for various purposes.
  • Since components will be deployed across machines in a cluster, file system must be accessible in a uniform way to all the machines.
  • To achieve this, either common network file system (like SAN) could be mounted or
  • Files could be replicated on all the machines in a cluster.
  • Instead of a file, maintain required information in the database.
  • Application logging - If application uses file to log messages then evaluate whether single file per cluster or separate file per server in cluster is required to be maintained.
Other Services
  • There are some functionalities and services which makes sense in stand-alone mode only. For example Timers/Schedulers and Email Notifications Services. These services are trigged by events instead of requests, and should only be executed only once. These services are hard to migrate to a cluster environment.
  • Some products have prepared for such services. For example, JBoss uses “clustered singleton facility” to coordinate all the instances to guarantee to execute these services once and only once. Spring Quartz (batch scheduler API) also provides clustering support.
Third Party Software

In an enterprise application usually other software/tools are used like Rule Engines, OR Mapper, third party cache manager, messaging software, logging component etc. Some of these third party software may not support clustering. This should be looked upon at design time before selecting these tools.

Application Testing Considerations

Test plans should be created to test application in cluster environment. Test cases should cover cluster objectives, which are load balancing, scalability and failover. To test cases, it might be required to create test stubs. Test cases should also cover any application specific and cluster sensitive services like file access, static data caching, and Singleton.

Brief Terminology

Scalabilty - Scalability refers to a system’s ability to support fast increasing numbers of users. One way to is to add resources (memory, CPU or hard disk) to a server. Clustering allows a group of servers to share the heavy tasks, and operate as a single server logically.
High Availability – Single server solution to scalability (adding more resources) is not a robust one because of its single point of failure. It is required that mission critical services are accessible with reasonable/predictable response times at any time. Clustering is a solution to achieve this kind of high availability by providing redundant servers in the cluster in case one server fails to provide service.
Load balancing - Load balancing is one of the key technologies behind clustering, which is a way to obtain high availability and better performance by dispatching incoming requests to different servers. In addition the load balancer should perform other important tasks such as “session stickiness” to have a user session live entirely on one server and “health check” to prevent dispatching requests to a failing server.
Fault Tolerance - Highly available data is not necessarily strictly correct data. A fault tolerant service always guarantees strictly correct behavior despite a certain number of faults.
Failover - Failover is a key technology behind clustering to achieve fault tolerance. By choosing another node in the cluster, the process will continue when the original node fails.




  1. Nice and clear post with basic considerations on application development in a clustered environment

  2. nice post! Thanks Yogesh!