A DB2 pureScale Primer
DB2 pureScale adopts many of the same concepts and terminology as the well-established DB2 for z/OS Data Sharing technology, usually considered to be the “gold standard” for shared-disk database architectures. Multiple DB2 instances, or “members”, accept and service incoming DB2 work, with all of them accessing a single copy of the data (usually held on a shared, high-performance, fault-tolerant disk subsystem).
So, how do you stop multiple processes all updating the same data at the same time? That’s where the clever technology known as the “coupling facility” (or CF) comes in. The CF is responsible for coordinating the activities of all of the DB2 members in the pureScale group, and takes the form of a dedicated unit officially called a “PowerHA pureScale server” (now you know why I’m calling it a CF…).
The CF holds shared locking information and cached data pages of interest to one or more members of the group. Each member has direct access to the CF via a high-speed InfiniBand interconnect, minimizing the performance overhead.
One of the design goals for pureScale was to minimize the impact on applications running in the cluster. Although minor changes may be needed to eke out the very best performance, it is perfectly possible for an application to run on a pureScale cluster without any changes whatsoever.
Workload balancing facilities are provided to allow work to be intelligently distributed across the DB2 members based on how busy each one is. Again, most applications will need no changes to take advantage of this.
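To make that a little more concrete, here is a minimal sketch of what the client side might look like for a Java application using IBM’s JDBC (JCC) driver. The host name, port, database name and credentials are placeholders, and the enableSysplexWLB property and the CURRENT MEMBER special register reflect my understanding of the driver and DB2 levels involved, so treat this as an illustration to check against your own documentation rather than a definitive recipe.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class PureScaleWlbExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("user", "db2inst1");   // placeholder credentials
        props.setProperty("password", "secret");

        // Ask the driver to balance transactions across the pureScale members.
        // Property name as I recall it from the JCC driver documentation;
        // verify against your driver level.
        props.setProperty("enableSysplexWLB", "true");

        // Connect to any one member; the driver learns the full member list
        // from the server once connected.
        String url = "jdbc:db2://member1.example.com:50000/SAMPLE"; // placeholders

        try (Connection con = DriverManager.getConnection(url, props);
             Statement stmt = con.createStatement();
             // CURRENT MEMBER reports which member served the request
             // (assumption: available at your DB2 level).
             ResultSet rs = stmt.executeQuery(
                     "SELECT CURRENT MEMBER FROM SYSIBM.SYSDUMMY1")) {
            if (rs.next()) {
                System.out.println("Served by member " + rs.getInt(1));
            }
        }
    }
}
```

The point of the sketch is that the balancing decision lives in the driver configuration, not in the application logic, which is why most applications can pick it up without code changes.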
How does all of this relate to the resilience and scalability requirements we discussed earlier? Well, hopefully the scalability one is fairly obvious: a pureScale user that needs to expand the available computing resource to handle higher workload volumes can simply add additional members to the group. IBM has used the hard-won experience of DB2 for z/OS Data Sharing to build in plenty of optimizations that minimize the sharing overheads, providing excellent scalability. In lab tests, that scalability has been impressively close to linear (e.g. doubling the number of members almost doubles the available capacity), but as always your mileage may vary. New capacity-based charging models are also being introduced, allowing users to rapidly scale their available resource up and down in a very cost-effective manner.
The resilience angle becomes apparent when you realize that it’s possible to run two CFs in a duplexed arrangement, with DB2 automatically keeping the primary and secondary CFs in sync. So, with dual CFs and multiple DB2 members all hosted in separate physical boxes, plus a fault-tolerant disk subsystem, there’s no single point of failure: losing a member, a CF or a physical disk still allows processing to continue (albeit at a potentially slower pace, as each surviving server has to shoulder more of the processing load). This is therefore a true “active/active” clustering solution.
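To show what “no single point of failure” can mean from an application’s point of view, here is a hedged sketch of a Java client that enables the driver’s automatic client reroute behaviour. The property names (maxRetriesForClientReroute, retryIntervalForClientReroute) and the -4498 error code for a rerouted connection are my recollection of how the JCC driver behaves, so confirm them against your driver documentation; hosts, database and credentials are again placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class PureScaleFailoverExample {

    // Runs a trivial unit of work, creating a fresh Statement each time.
    static void runUnitOfWork(Connection con) throws SQLException {
        try (Statement stmt = con.createStatement()) {
            stmt.executeQuery("SELECT 1 FROM SYSIBM.SYSDUMMY1");
        }
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("user", "db2inst1");   // placeholder credentials
        props.setProperty("password", "secret");

        // Reroute/retry settings, named per my reading of the JCC driver docs;
        // check against your driver level.
        props.setProperty("enableSysplexWLB", "true");
        props.setProperty("maxRetriesForClientReroute", "5");
        props.setProperty("retryIntervalForClientReroute", "2"); // seconds

        String url = "jdbc:db2://member1.example.com:50000/SAMPLE"; // placeholders
        try (Connection con = DriverManager.getConnection(url, props)) {
            try {
                runUnitOfWork(con);
            } catch (SQLException e) {
                // SQLCODE -4498: the driver re-established the connection on a
                // surviving member and rolled back the in-flight transaction;
                // the application simply retries its unit of work.
                if (e.getErrorCode() == -4498) {
                    runUnitOfWork(con);
                } else {
                    throw e;
                }
            }
        }
    }
}
```

In other words, when a member or CF is lost the cluster keeps accepting work, and a well-behaved client only needs to be prepared to retry the transaction that was in flight at the moment of failure.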
It’s important to understand that the performance of a pureScale cluster is critically dependent on the speed of the interconnects between the various members and the CF. Therefore, this technology is suitable for providing local resilience only (i.e. within the same machine room). If you also require an off-site DR capability (just in case the proverbial Jumbo jet really does decide to use your server room as an emergency landing strip), you’ll need to combine pureScale with other solutions such as HADR, which can operate over much greater distances thanks to their asynchronous modes.
DB2 pureScale links:
Download the DB2 pureScale podcast
Read our DB2 pureScale blogs – Five days in the labs