Friday, September 08, 2006

Scaling OracleAS with Multiple JVMs

A hugely useful capability of Oracle Application Server that has been in the product for a dog's age is what is called multiple processes per OC4J instance. In OracleAS 10.1.2 you can find it hidden away in the documentation here: http://download-west.oracle.com/docs/cd/B14099_11/core.1012/b14001/optj2ee.htm#CACCHACG
and here: http://download-west.oracle.com/docs/cd/B14099_11/core.1012/b14003/midtiermanage.htm#CACCHFJC

In summary, it lets you take one configuration set of an OC4J (J2EE server) and instantiate n JVMs - you choose the number - all running that configuration set. Rather than you as the administrator installing n instances of the application server or manually starting n JVMs, the application server does it for you: it takes the number of JVMs you enter and runs the OC4J with that many JVMs.

The end result is that on machines that have large numbers of CPUs, or huge CPU capacity from multi-core technologies or appliances like those Azul provides, it is really trivial to have your application, if it is CPU intensive, suck up all those resources for a minimum of administrative overhead. Clearly it gives you some availability too, but limited to a single machine - it is seldom used for this (or only as a by-product); rather, it is there to optimize your hardware. It is a single parameter - turn the number of JVMs up or turn it down. No install, no extra deployment. Nothing.

Personally, this has always seemed like an undermarketed feature of Oracle Application Server: not only does it give "free" scalability and work for any J2EE application, it is so easy to use relative to what I have seen elsewhere. It is hugely popular in the Oracle install base because once you see it in action it is a no-brainer to use. Here's how it works.

The following picture gives a sense of what the number of processes feature does.


What you see in the outer dark grey box is an Oracle Application Server instance - this is just a process space in which OC4J runs. For those in the know, this could be thought of as an instance managed by the process manager, Oracle Process Manager (OPMN). Within that, in the lighter grey box, you can see that there is an OC4J instance called OC4J_home. What that really is is the configuration set, or set of configuration files, associated with this J2EE/OC4J server - the startup parameters of the server, the J2EE applications deployed to it, and the queues, topics, datasources and adapters configured. An OC4J configuration instance, so to speak, though that is not official terminology.

The white box, in the center, holds the actual runtime instances of the OC4J_home configuration instance. Within it you can see that there are 4 processes. What that shows is 4 JVMs running the OC4J_home instance and using the identical configuration set underpinning OC4J_home. If I change a datasource in that OC4J_home instance configuration, all 4 OC4J_home runtime instances know about it. If I deploy a new application, all 4 OC4J_home runtime instances know about it. I do my configuration operations once and all instances pick them up.
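The relationship between one configuration set and its runtime instances can be pictured with a toy model. This is only a sketch of the idea, not how OPMN or OC4J actually distribute configuration; the class names, the datasource name and URL, and the application name are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigSet:
    """Toy stand-in for an OC4J configuration set such as OC4J_home."""
    name: str
    datasources: dict = field(default_factory=dict)
    applications: list = field(default_factory=list)


@dataclass
class RuntimeInstance:
    """Toy stand-in for one JVM running the configuration set."""
    index: int
    config: ConfigSet  # every instance points at the same object, not a copy


def start_instances(config: ConfigSet, numprocs: int) -> list:
    """Create numprocs runtime instances that all share one ConfigSet."""
    return [RuntimeInstance(i, config) for i in range(1, numprocs + 1)]


home = ConfigSet("OC4J_home")
jvms = start_instances(home, 4)

# One configuration change, made once (hypothetical names and URL)...
home.datasources["jdbc/OrdersDS"] = "jdbc:oracle:thin:@//dbhost:1521/orcl"
home.applications.append("orders.ear")

# ...is visible from every runtime instance.
for jvm in jvms:
    print(jvm.index, sorted(jvm.config.datasources), jvm.config.applications)
```

The point of the model is simply that the administrator touches one thing (the configuration set) and the number of JVMs is a separate dial.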

How do I turn this on? Well, in OracleAS 10.1.3.1 I simply go to the server configuration page for a particular OC4J instance and tell the server how many JVMs to run, like the picture below:

A single number and hit the apply button. That's it.

If I like editing XML, I can go to <oracle_home>\opmn\conf\opmn.xml and edit the numprocs attribute on the process-set of the OC4J instance I am interested in, but doing it from the console or scripting the change through JMX gracefully handles the startup of the changed configuration.
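For those who do want to script the XML edit, here is a minimal sketch in Python. It assumes a simplified, namespace-free fragment built from the element names and port values quoted in the comments below (ias-component, process-type, port, process-set); a real opmn.xml contains more elements and attributes, and the function name set_numprocs is my own. It only rewrites the text and does not restart anything.

```python
import xml.etree.ElementTree as ET

# Simplified fragment for illustration only; not a complete opmn.xml.
SAMPLE = """\
<ias-component id="OC4J">
  <process-type id="home">
    <port id="default-web-site" range="8890" protocol="http"/>
    <port id="rmi" range="12401-12500"/>
    <port id="jms" range="12601-12700"/>
    <process-set id="default_group" numprocs="1"/>
  </process-type>
</ias-component>
"""


def set_numprocs(xml_text: str, process_type_id: str, numprocs: int) -> str:
    """Return xml_text with numprocs updated for the given process-type."""
    if numprocs < 1:
        raise ValueError("numprocs must be at least 1")
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nesting depth does not matter.
    for process_type in root.iter("process-type"):
        if process_type.get("id") == process_type_id:
            for process_set in process_type.iter("process-set"):
                process_set.set("numprocs", str(numprocs))
            return ET.tostring(root, encoding="unicode")
    raise KeyError(f"no process-type with id {process_type_id!r}")


updated = set_numprocs(SAMPLE, "home", 4)
print(updated)
```

Going through the console or JMX remains the easier route, since, as noted above, those paths also take care of starting the changed configuration.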

Once you have made your change, in OracleAS 10.1.3.1, you can go to any page and it will tell you how many JVM instances are running per OC4J instance, as this picture shows (from the cluster topology page):


Pretty cool. Now in real life you frequently want not only to utilize the CPU resources on your individual machines in the most efficient and operationally simple way possible (this being an example); clearly you are also concerned about availability should disaster strike and you need to fail over gracefully to other machines.

As the above picture shows, most people actually run multiple OC4J instances with multiple JVMs and then distribute them across machines. The slightly expanded view of that screen below shows that not only can you use multiple JVMs on a single machine, you can also pretty quickly and easily create groups of OC4J instances, each with a tailored number of JVMs, spanning machine boundaries and application server instances.


Because I am but a poor man with only one machine, I have to simulate 3 application server instances: one for two J2EE servers (soa_j2ee), one for an Apache HTTP server (soa_web) and another for what Oracle calls the SOA Suite (soasuite), containing our BPEL Process Manager, ESB, Rules engine and Web services manager, amongst others. It is a contrived example, but it shows a simplified environment (imagining each of these instances on separate machines) that could deliver HA while utilizing machine resources effectively with the JVM feature.

At the bottom, which I suppose is a topic for another writeup, is the grouping feature, which lets you group OC4J instances into a logical group that can span machines and OracleAS instances. Ironically, despite the seeming complexity of this setup, I would estimate it took me about 40 minutes to set up - 20 minutes of which was running the one-button-click install while the bits were laid down on my machine.

From what has always been available as a simple 70 megabyte download - OC4J Stand-alone (http://www.oracle.com/technology/tech/java/oc4j/10131/index.html), designed for developers, though many use it in production deployments because it is so lightweight and easy to use (it corresponds to a single JVM in this writeup) - to the full-fledged managed version I am running here - Oracle Application Server (http://www.oracle.com/technology/software/products/ias/soapreview.html) - this is not a bad story. I don't think it is told very often or well understood, but at least a stab at it has been taken here :-)

6 comments:

Unknown said...

It seems that you were using AS version 10.1.3.1. Does multiple JVM work in AS 10.1.3.0? How should the opmn.xml look? For example, I have the following configuration:

<port id="default-web-site" range="8890" protocol="http"/>
<port id="rmi" range="12401-12500"/>
<port id="jms" range="12601-12700"/>
<process-set id="default_group" numprocs="2"/>

and I am getting the following error:

ias-component/process-type/process-set:
OC4J/home/default_group/

Error
--> Process (index=1,uid=931202274,pid=0)
no port available from the port range
failed to start a managed process after the maximum retry limit
no port available from the port range
no port available from the port range
Log:
none
Thanks.

Mustafa

Mike Lehmann said...

Unfortunately, multi-JVM is not supported in OracleAS 10.1.3.0. You have correctly edited the opmn.xml, but in 10.1.3.0 there are a variety of race conditions that can occur, so we said "not supported" and fixed the issues in 10.1.3.1. Remember that 10.1.3.1 is not only a full release but also a patchset - you can apply the patchset to a 10.1.3.0 install and then have all the functionality of 10.1.3.1, including multi-JVM support.

Jaikiran said...

Interesting topic, Mike. I wasn't aware that any application server provides support for running multiple JVMs for a single instance of the app server.

Speaking from the perspective of an application that's deployed on an Oracle instance having multiple JVMs, what additional factors should the application care about? I mean, since multiple JVMs are handling the requests from the client to the server, how would the application ensure that the correct JVM is processing the request and not any arbitrary JVM? More specifically, how would the Java objects of the application be handled, since each JVM would have its own instances of the objects?

Mike Lehmann said...

Normally the issues customers have are:

1. How many JVMs to allocate. This can be a quick and easy way to manage things like GC - adding another JVM can frequently reduce GCs, depending on your application's performance characteristics - or to take advantage of additional processing power on the machine to scale out your throughput.

2. Your question - how does the app ensure it is operating on the right JVM. That is the job of the OPMN process coordinating this, working in conjunction with mod_oc4j - our router from the HTTP layer. Typically for Web applications you will see sticky requests get routed to the same JVM (e.g. if you are carrying a session); however, if it is stateless, mod_oc4j has a variety of load balancing capabilities - see: http://download-west.oracle.com/docs/cd/B32110_01/web.1013/b28948/load.htm#CIHEABJD

but remember it is load balancing over the OC4J instances - each JVM you allocate is effectively an OC4J instance.
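To make the sticky versus stateless distinction concrete, here is a toy sketch in Python. It is not how mod_oc4j is implemented; the class name, the JVM labels and the simple round-robin policy are all assumptions chosen for illustration.

```python
import itertools


class ToyRouter:
    """Toy model of sticky vs. stateless routing across JVMs."""

    def __init__(self, jvm_ids):
        self._jvm_ids = list(jvm_ids)
        self._round_robin = itertools.cycle(self._jvm_ids)
        self._sessions = {}  # session id -> JVM that first served the session

    def route(self, session_id=None):
        # Stateless request: hand it to the next JVM in round-robin order.
        if session_id is None:
            return next(self._round_robin)
        # Session-carrying request: pin the session to one JVM on first sight.
        if session_id not in self._sessions:
            self._sessions[session_id] = next(self._round_robin)
        return self._sessions[session_id]


router = ToyRouter(["jvm-1", "jvm-2", "jvm-3", "jvm-4"])
stateless = [router.route() for _ in range(5)]
sticky = [router.route("session-A") for _ in range(3)]
print(stateless)  # spreads across the JVMs in order, then wraps around
print(sticky)     # the same JVM every time
```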

If you need object sharing - e.g. state replication - then we take care of that out of the box. If you need something more sophisticated - a shared service - something like the recent announcement of Tangosol and Oracle getting together makes for a pretty effective solution :-)

Jaikiran said...

Thanks Mike for the explanation.
You mentioned:

but remember it is load balancing over the OC4J instances - each JVM you allocate is effectively an OC4J instance.


I believe Oracle AS supports clustering (multiple server instances). How is this different from the multiple JVM approach? By going for the multiple JVM option, are we just avoiding the overhead of having multiple installations of the server?

Mike Lehmann said...

Clustering is an overloaded term in OracleAS. There are several layers that are supported:

1. From the Oracle HTTP Server to OC4J instances via mod_oc4j. This is load balancing across an arbitrary number of OC4Js (inside or outside of a specific Oracle home).

2. Multiple OC4J instances - either in separate Oracle homes, differently named OC4J instances in the same Oracle home, or finally multiple OC4J instances using the numprocs setting.

3. All variations of the above can have lifecycle management (start/stop, monitor) as well as deployment and resource configuration. Sometimes people mean this when they say cluster support.

A good starting point that covers this and more (e.g. load balancers, Web Cache and the layers to the database, for things like load balancing and fast connection failover for features specific to Oracle like Oracle RAC) is:

http://download-west.oracle.com/docs/cd/B31017_01/core.1013/b28941/aa.htm#i1018957

Mike.