This is something I have wanted to do for a long time, so here goes.
Today I will write about the Oracle assignment that I am busy with – trying to speed up an Oracle data warehouse. The most important lesson learned during the assignment is to balance your Oracle investment. You can think of your warehouse as a data supply chain or a data pipeline. If what your CPUs, RAM and IO subsystem can handle in terms of continuous, sustainable throughput that matches the demand that users put on the system, then you will either not meet business expectations or be wasting money on idle resources.
In this particular instance, for example, a single 2GB/s HBA has to handle all the traffic for 4 different Oracle databases. The machine has 4 CPU cores, of which two is available for Oracle. This being a bit dodgy in it self, still requires about 400 MB/s throughput capability.
The other main problem that was identified was that there was not enough RAM for the large hash joins that were frequently being done due to data analysts doing ad hoc queries reading up to 100 percent of large, multi Gigabyte tables.
In particular, one of-the-shelf marketing automation application experienced debilitating performance problems. Since we could not tune the application and getting HBAs swopped in a shared environment within a complex, large organisation with multiple outsourcing partners is not something that happens overnight, we had to do some short term solution type stuff.
More PGA RAM (much faster hash joins) and data compression (faster full table scans) saved the day.
Signing off
Herman Scheepers