Building aggregations in your SSAS Multidimensional will make your queries faster, right? While that’s true, they will only make a noticeable difference to performance if your query has Storage Engine-related problems rather than Formula Engine-related problems. What’s more, even when do you have Storage Engine-related problems there are some cases where you may find that aggregations don’t give you the kind of performance boost that you expect. In this post I’ll explain why this can happen, how you can use the Resource Usage Profiler event (as described in parts 1 and 2 of this series) to find out when this is happening, and how you can deal with the problem.
Aggregations make queries go faster for two reasons:
- Most importantly they contain pre-aggregated data. For example your fact table might contain data at the Day granularity but an aggregation might contain fact data aggregated up to the Year granularity. This means that when a query needs to get data at the Year granularity, SSAS does not need to read the fact table-level data stored in the partition, it can read the data it needs direct from the aggregation without needing to aggregate any data at query time.
- Secondly, because the size of an aggregation is usually a lot smaller than the size of the fact table-level data stored in a partition, it is much faster to read data from an aggregation.
However, regarding this second point, there’s a catch: you’ll know if you’ve read the previous posts in this series that SSAS builds indexes on fact data so it can scan that data very quickly, but most builds of SSAS do not build indexes on aggregations. I say ‘most’ because there were a few builds of SSAS that did build indexes on aggregations, but as this article explains this feature was turned off soon after it was introduced because it was found that the time spent building those aggregations was not usually worth any gain in query performance that resulted.
The Resource Usage Profiler event can be used to monitor the number of rows read and rows scanned when SSAS reads data from an aggregation.Taking the original test query and cube from the previous posts in this series:
[sourcecode language=”text” padlinenumbers=”true”]
…the Resource Usage event shows that when run on a cold cache and no aggregations are built on the cube, SSAS scans 256 out of 5000 fact rows of data – a single page – and returns one row:
However if this cube has an aggregation built on the ID attribute of the ID dimension, the same granularity as the fact table:
…when the query runs and SSAS reads data from this aggregation rather than the fact data, the Resource Usage event returns the following:
Notice that the ROWS_SCANNED value is now 5000. This is because the aggregation has no indexes built on it so SSAS has to scan all the rows in the aggregation. Reading data from an aggregation at the same granularity as the fact data is therefore a lot less efficient than reading data from the original fact data.
Of course, since most aggregations are usually much smaller than the original fact data, the lack of indexes is not so important because scanning all the data is going to be very quick anyway. However, on very large cubes you may need to build some very large aggregations and find that even when your queries hit these aggregations, performance is still bad because of this lack of indexes. If you see this happening, and can see the ROWS_SCANNED value in the Resource Usage event reporting very high values, then it might be a good idea to enable the building of indexes on aggregations.
You can do this by changing the values of the AggIndexBuildEnabled and AggIndexBuildThreshold server properties in the msmdsrv.ini file. Setting AggIndexBuildEnabled to 1 allows SSAS to build indexes on aggregations. It’s not necessary to build indexes on all aggregations though: you can specify that only aggregations larger than a certain number of rows have indexes built using the AggIndexBuildThreshold property. The only public documentation for these properties is given in two articles here and here, and I strongly recommend you read these articles so that you understand the implications of doing this on your processing times. You should only consider changing these properties if you are a very experienced SSAS developer and you monitor the effects carefully – I’m not even sure if changing these properties is supported by Microsoft.