When working with a customer a couple of years ago, we observed what appeared to be an odd situation, where we were well short of the DDF CONDBAT limit, but new connections to DB2 for z/OS were failing. What could be causing this?
What do you mean by…..?
Some terminology, just to start off with so that we’re all talking about the same thing:
- Connection. This is a TCP/IP thing and can be best thought of as the established pipe between the client application / machine and DB2.
- Database Access Thread (DBAT). This is the SQL processing engine, working as the allied connection owner in z/OS in the DB2 xxxxDIST address space.
When DB2 is running normally, a distributed client will connect to DB2, establishing a “connection”, and then execute some SQL, acquiring and using a DBAT. The DBAT is released by the client with its other resources (RELEASE(COMMIT) / RELEASE(DEALLOCATE)).
Figure 1 – Connections and DBATs
As shown in figure 1, the limits set in ZPARMs for these entities are CONDBAT to limit the number of connections (external pipe contacts) and MAXDBAT to limit the number of distributed work engines.
One of the challenges that has to be faced is, what do you want to happen when all of the DBATs are in use and another DDF connection arrives? It maybe that DBAT queuing is unacceptable, and you’d rather return an error condition to your application than wait for a DBAT to become available. This can be achieved by setting CONDBAT = MAXDBAT. In this case, when you run out of DBATs, you’ve also run out of TCP/IP pipes – connections – and 00D31034 results.
Most customers would rather accept a little queuing than fail the transaction, so tend to set CONDBAT > MAXDBAT. But what level of queuing results?
There are two answers to this:
Before DB2 V10 and PM43293
There was previously a limit on the maximum number of connections that DB2 would allow to be queuing for a DBAT – and this was set to MAXDBAT as well:
Figure 2 – DBAT Queuing Before PM43293
As shown in Figure 2, above, the maximum number of connections that can be established is CONDBAT. Of that number, MAXDBAT can be doing active work using a DBAT, and a further MAXDBAT can be connected and waiting to acquire a DBAT to perform work. Any remaining connections will be inactive (not queued for a DBAT).
A connection queuing for a DBAT will wait forever.
Note that both the queue depth and the (unending) timeout are undocumented until PM43293.
With DB2 V10 and PM 43293 and Up
Two new ZPARMs were introduced with PM43293 in V10 and are part of base V11 onwards:
- MAXCONQN – the maximum number of connections waiting for a DBAT beyond MAXDBAT, i.e. the length of the queue for a DBAT once MAXDBAT has been reached.
- MAXCONQW – the maximum wait time queuing for a DBAT.
Figure 3 – DBAT Queuing With PM43293
As well as absolute values, there are special values available for these new parameters as well:
- MAXCONQN = OFF : The DBAT queue length = CONDBAT – i.e. you will hit maximum number of connections (CONDBAT) before you hit maximum number of connections queued for a DBAT – see Figure 4, below.
- MAXCONQN = ON : The DBAT queue length = MAXDBAT (old behaviour)
- MAXCONQW = OFF : Queue forever for a DBAT (old behaviour)
- MAXCONQW = ON : Maximum queue time = IDTHTOIN
Figure 4 – DBAT Queuing With PM43293 and MAXCONQN = OFF
Conclusion
Whilst these changes have been around for a while now (18 months), a lot of sites are still using the default behaviour. It should be noted that whilst this means that behaviour from previous releases is unchanged, there is an opportunity to improve application availability by working with these two new ZPARMs.
As with all ZPARM update, monitoring behaviour before and after making the changes is crucial for successful implementation.