OLTP and BI mixed workloads – a merry coexistence in DB2 10.5 “Cancun Release”
If you have been using DB2 10.5 with a mix of row and column based tables in the same database you may have experienced slowdown in some of your OLTP queries accessing your row based tables after implementing column based tables. We had this issue with one of our customers after implementing columnar tables. In one scenario this happened due to an access plan change where the optimizer chose a more expensive HSJOIN after the SORTHEAP was increased.
A quick background: columnar tables need a very high allocation of database level sort memory, (allocated using the sortheap and sheapthres_shr database parameters) for their processing. However, changing sort memory to a very high value can have an adverse effect on some OLTP queries, depending on predicate and statistical distribution. With a small SORTHEAP value, the optimizer correctly chooses NLJOIN (nested loop join) or MSJOIN (merge sort join) because HSJOIN (hash sort join) is estimated to have significant sort spilling. But with a very high SORTHEAP value, the optimizer may think there is no spilling and hence wrongly decides HSJOIN is cheaper, causing the OLTP queries to take almost twice as much time to execute. We noticed in some of our queries that access plans were using MSJOIN and index scan before the SORTHEAP was increased. After the SORTHEAP increase, the access plans got changed and the optimizer decided to use HSJOIN and a table scan.
This meant that in order to maintain the SLAs for the OLTP queries, we would have to place the columnar tables in a separate database. Not ideal.
Now, enter the new DB2 10.5 FP4 “Cancun Release” and we have a solution! Using the DB2 registry variable DB2_EXTENDED_OPTIMIZATION, we can specify a lower SORTHEAP value (the original SORTHEAP value before it was increased to accommodate columnar tables) and force the optimizer to use that value for OLTP queries accessing row-based tables.
db2set DB2_EXTENDED_OPTIMIZATION=”OPT_SORTHEAP_EXCEPT_COL 1024″
If the query is not using any column organized objects then the optimizer can decide to ignore the database level large SORTHEAP value and to use instead the new value specified in this registry setting (1024 in this example).
After setting the DB2_EXTENDED_OPTIMIZATION registry variable in our database, we noticed the access plan for the OLTP query was back to how it was originally. MSJOIN and index scan was used and the response time was back to normal.
This now means row and column based tables can co-exist in the same database without OLTP queries on row tables being adversely affected.
This is also really useful for customers who have a need to carry our reporting on their transactional databases. They can use shadow tables (see this blog) which are columnar, without worrying about the adverse impact the large sort memory will have on their OLTP queries accessing row based tables.
Authors: Somu Chakrabarty and Iqbal Goralwalla