Thursday, February 11, 2010

BPEL process Stuck in manual recovery

You are running a BPEL process, and its instances get stuck in recovery for hours without being processed. Even after a manual recovery, the instances are still not processed.

The root cause of the deadlock is that connections are not released back to the OC4J private connection pool. By default, cacheConnections is set to true. With that setting, JCAConnectionPool keeps the OC4J connection checked out (because the connection is cached), so OC4J never puts it back into its own connection pool. The caching exists so that a later request to JCAConnectionPool can reuse the existing connection instead of checking out another one from the OC4J pool.

Whether you hit the problem also depends on the pooling scheme you use. With the dynamic scheme this is not an issue, but with the fixed_wait scheme the threads can lock up: the cached connection does not come back to the pool, so a new request that arrives in the meantime finds no connection available and has to wait. To overcome this issue, make the following change in your bpel.xml.

Switch off cacheConnections by setting
cacheConnections=false in bpel.xml for the partner link.

Your bpel.xml should now have an entry like this:

<partnerLinkBinding name="adapter">
<property name="wsdlLocation">adapter.wsdl</property>
<property name="retryInterval">60</property>
<property name="cacheConnections">false</property>
</partnerLinkBinding>


Ideally a connection pool should be defined like this

data-sources.xml

<connection-pool name="TEST" min-connections='2'
max-connections='20'
initial-limit='0'
used-connection-wait-timeout='60'
inactivity-timeout='60'
connection-retry-interval='30'
max-connect-attempts='10'
validate-connection='false'
num-cached-statements='0'
time-to-live-timeout='-1'
abandoned-connection-timeout='-1'
property-check-interval='900'>

<connection-factory ..............>

</connection-pool>

For better performance, min-connections should not be zero. With a non-zero value some connections are always available, so a new request can use a connection that is already in the pool instead of creating one from scratch.

Always specify max-connections to limit the maximum number of connections the pool can open.

initial-limit
The initial size of the connection cache.
When a session starts, if initial-limit is greater than max-connections (e.g. initial-limit=10 and max-connections=5), only max-connections connections are initialized.

When a session starts, if initial-limit is less than min-connections (e.g. initial-limit=10 and min-connections=15), only initial-limit connections are initialized. Later, as more connections are requested, the pool grows until it holds min-connections connections.
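
As a rough illustration of the rules above, the fragment below sets initial-limit between min-connections and max-connections, so neither of the two special cases applies. The numbers are placeholders rather than recommendations, and the connection-factory element is left out.

```xml
<!-- Sketch only: the values are hypothetical -->
<connection-pool name="TEST"
    min-connections='2'
    max-connections='20'
    initial-limit='5'>
    <!-- connection-factory element goes here; omitted in this sketch -->
</connection-pool>
```

Going by the descriptions above, a pool defined this way would start with 5 connections and never grow beyond 20.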

used-connection-wait-timeout
The time, in seconds, that the server waits for a used connection to be released by a client.


inactivity-timeout
The time, in seconds, that an unused connection may remain inactive before it is removed from the pool.


connection-retry-interval
The time, in seconds, that the server waits before retrying a failed connection attempt.

max-connect-attempts
The number of times the server will try to connect.

validate-connection
If set to true, every connection taken from the connection pool is validated against the database. This costs performance, so it should be kept false. All the values I have specified above are the ones I consider ideal.

num-cached-statements
The maximum number of SQL statements that should be cached for each connection. Any value greater than 0 automatically enables statement caching for the data source. This again adds overhead, so it should be set to zero.

time-to-live-timeout
The maximum time that a used connection may stay active.
The default of -1 means that the feature is not enabled.

abandoned-connection-timeout
The time, in seconds, that an unused logical connection may remain inactive before it is removed from the pool.
The default of -1 means that the feature is not enabled.

property-check-interval
Used with Oracle databases only. It is the interval, in seconds, at which the cache daemon thread enforces the timeout limits. The default value is 900.
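
If connections that never come back to the pool are the suspect, the two timeouts that default to -1 could be switched on so that the cache daemon has limits to enforce. This is a minimal sketch that assumes the attribute semantics described above; the values 600, 300 and 60 are made up for illustration and would need tuning for your own load.

```xml
<!-- Sketch only: timeout values are hypothetical, not tested recommendations -->
<connection-pool name="TEST"
    min-connections='2'
    max-connections='20'
    time-to-live-timeout='600'
    abandoned-connection-timeout='300'
    property-check-interval='60'>
    <!-- connection-factory element goes here; omitted in this sketch -->
</connection-pool>
```

property-check-interval is lowered here because, going by its description, the timeout limits are only enforced when the daemon thread runs.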


You also need to configure your oc4j-ra.xml properly. I am listing only the important properties that you need to specify.


<connector-factory location="eis/adapter" connector-name="AnyAdapter">
<config-property name="connectionFactoryLocation" value="Specify connection Factory location"/>
..................................
<connection-pooling use="private">
<property name="waitTimeout" value="180" />
<property name="scheme" value="fixed_wait" />
<property name="property-check-interval" value="5" />
<property name="inactivity-timeout-check" value="all" />
<property name="abandoned-connection-timeout" value="20" />
<property name="autoCloseSession" value="" />
<property name="inactivity-timeout" value="30" />
<property name="maxConnections" value="50" />
<property name="minConnections" value="0" />
</connection-pooling>
<security-config use="none">
</security-config>
</connector-factory>
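
For comparison, this post notes near the top that the dynamic scheme does not run into the locking problem. A connection-pooling block using it might look like the sketch below. The use="private" attribute is an assumption based on the private pool mentioned earlier, and the limits are simply copied from the example above, so check both against your adapter's documentation.

```xml
<!-- Sketch only: use="private" is assumed; limits copied from the fixed_wait example -->
<connection-pooling use="private">
    <property name="scheme" value="dynamic" />
    <property name="maxConnections" value="50" />
    <property name="minConnections" value="0" />
</connection-pooling>
```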

If your adapter is configured with all these parameters and correct values, you should not run into this issue most of the time. The settings vary from adapter to adapter, but they are valid for most adapters.
