For more years than I care to mention, I’ve been tinkering with bespoke DB2 monitoring. Wiser heads have always come back with the (not unreasonable) call that this is bonkers as:
a) We already have a monitor
b) Who’s going to maintain mine if I fail the bus test
So, just when I’m beginning to look like a proper loon, IBM pops up with APAR PM90886 (DB2 V10) and PM96478 (DB2 V11) and introduces some new fields to IFCID 3. These three fields contain the savings that DB2 thinks you could have made if you’d just run that monster query in an accelerator.
- V10 : https://www-01.ibm.com/support/docview.wss?uid=swg1PM90886
- V11 : https://www-01.ibm.com/support/docview.wss?uid=swg1PM96478
The nice thing about this APAR is it lets you get at the information straight away – even if you don’t already have an accelerator. What it’s going to do is try and estimate how much elapsed, CPU and zIIP time you could have saved if the data was in an attached and configured accelerator and the query was handed off to it.
This is quite neat, and it would be really useful to be able to get at that data for simple cost justification and to help with application / accelerator planning.
As a proof of concept, I used the information from the previous posts (some replicated here, others still loafing around on Blogger – cf https://db2dinosaur.blogspot.co.uk/) to create some assembler that :
- Connects to DB2
- Establishes WBUF setting buffer trigger threshold at 1 byte
- Starts a trace (-START TRACE(ACCTG) CLASS(1,2,3,7,8) DEST(OPN))
- Wait for the buffer trigger to post, then
- READA the buffer
- Format the data
- Reset ECBs, and loop back to the wait
The data was filtered so that we were just looking at IFCID 3, then formatted by passing the returned record to some Rexx. I know – it’s not quick and it uses plenty of CPU, but this is a proof of concept and the nice thing about Rexx is it’s quick to get something working.
The following is some output that was tracking this query running against a table with one million rows of random data and no indexes that would help:
SELECT AVG(R2) AS AVGR2, STDDEV(R2) AS SDR2, R1 FROM DINO.TEST1 GROUP BY R1 ORDER BY R1
(R1 = SMALLINT, R2 = INTEGER)
The output has been snipped to prevent the excellent collection of zeros that make up most of the report from making the rest of this a boring read:
----------------------------Instrumentation Section------------ Begin time : 2014-02-10-09:50:21.269387 End time : 2014-02-10-09:52:31.401876 Begin CPU : 0.165366 End CPU : 124.803044 Reason inv : 0000000C Deallocation - normal program termination Network ID : # commit : 1 # abort : 0 C1 elapsed : 117.547220 C1 CPU : 111.901916 (not zIIP or SP CPU time) I/O wait : 0.012344 Lcl lock wait : 0.017767 # : 2 Async read IO : 0.002903 # : 4 Latch wait : 0.000165 # : 4 : <snip> : -- ACCELMODEL=YES -- Eligible elap : 116.612038 Eligible CPU : 111.844109 Eligible zIIP : 0.000000
Looks like a good candidate for acceleration. I wonder if we can get one of those plugged in to our zPDT?