Tuning YTD-style calculations

This subject came up in a recent thread I was involved in on microsoft.public.sqlserver.olap, and in my experience poorly-performing queries using YTD-style calculations are a common problem. When you think about it the reason why is fairly clear: retrieving and then summing up up to 12 months of data is always going to take longer than retrieving one month of data alone, especially if the summing up ends up happening on the client rather than the server. When you are working with large resultsets or have other calculations in play then the effect often seems much worse; the fact that there may not be any aggregations designed at the month level (often the bottom level of the Time dimension) in the cube just compounds this problem.

So what can be done to tune these kinds of queries, apart from the usual stuff like partitioning? I have two main recommendations:

1) Make sure you have aggregations built at the month level. Setting the Aggregation Usage property on a level (select the dimension in the Cube Editor and then look under the advanced tab) will give some guidance to Analysis Services about where you want your aggregations built, but there is no guarantee if you set it to ‘Bottom Level Only’ aggregations will get built at the bottom level – they might fall foul of the 1/3 rule (explained in detail in the Analysis Services Performance Guide) for example.

Unfortunately Analysis Manager won’t tell you what aggregations have been built when you go through its wizards, so you’ll need to use something like the Partition Manager tool (available as part of the SQL 2K Resource Kit or the BI Accelerator) to see this. If all of the aggregations listed are above the month level but all of your queries are at the month level, then your aggregations are simply not being used! If you can’t persuade Analysis Manager to build aggregations where you need them by setting the Aggregation Usage property and changing level member counts, then as a last resort try creating the aggregations manually using Partition Manager. Knowing which aggregations to build, though, when you’re building them manually can be very difficult and I’ll try to tackle this subject in a future blog entry.

One last thing to note here is that if your queries are at the leaf levels of almost all of your dimensions, then building aggregations is unlikely to help query performance much and will probably slow down cube processing a lot. If so, then my second recommendation might be more useful…

2) Use aggregations at higher levels on your Time dimension to your advantage. Consider the situation where you’re doing a YTD calculation for the month of December – if all your measure is additive, then the result should be the same as the value at the Year level. Similarly, when you’re doing a YTD calculation for the month of July – as well as the sum of seven months, it could instead be expressed as the sum of quarter 1, quarter 2 and the month of July, a sum of three values instead of seven. In a cube with no aggregations then then getting the value at the Year level and summing up two quarters and a month would probably perform about the same as summing up the respective months, but if aggregations are present at the Year and Quarter then if you can help Analysis Services avoid a lot of summing up at query time by getting your YTD calculation to leverage these presummarised totals.

The real problem here is how to write the MDX for your YTD calculation to do this, and it’s not easy. Here is a worked example on Foodmart 2000 showing how you can change a set containing months such as {[Time].[1997].[Q1].[1]:[Time].[1998].[Q2].[5]} that the YTD function would return into a set where as many months as possible are replaced by their common ancestors, so {[Time].[1997], [Time].[1998].[Q1], [Time].[1998].[Q2].[4], [Time].[1998].[Q2].[5]}:

WITH
/*The initial time range expressed in months*/
SET MYRANGE AS ‘{[Time].[1997].[Q1].[1]:[Time].[1998].[Q2].[5]}’
/*Find out which whole years are present*/
SET YEARSPRESENT AS ‘
/*Loop over all Years from the ancestor of the first month in the range to
the ancestor of the last month*/
GENERATE(
{ANCESTOR(MYRANGE.ITEM(0), [Time].[Year]) : ANCESTOR(TAIL(MYRANGE,
1).ITEM(0), [Time].[Year])}
/*Need to use nasty string functions to get IIF to return members or sets.
Yukon does not have this restriction!*/
, STRTOSET(
/*If all of the descendants at month level of the current year appear in the
range, we can replace them with the year member*/
IIF(COUNT(INTERSECT(MYRANGE, DESCENDANTS(TIME.CURRENTMEMBER,
[Time].[Month]))) = COUNT(DESCENDANTS(TIME.CURRENTMEMBER, [Time].[Month]))
, "{" + MEMBERTOSTR(TIME.CURRENTMEMBER) + "}"
, "{}")
)
)’
/*As an intermediate step, remove all the months from the initial range
which will be replaced by whole years*/
SET MONTHSLEFT AS ‘EXCEPT(MYRANGE, DESCENDANTS(YEARSPRESENT, 
[Time].[Month]))’
/*Now go through all the remaining months and see which ones have whole
quarters present, using the same logic as above*/
SET QUARTERSPRESENT AS ‘GENERATE(
{ANCESTOR(MONTHSLEFT.ITEM(0), [Time].[Quarter]) : ANCESTOR(TAIL(MONTHSLEFT,
1).ITEM(0), [Time].[Quarter])}
, STRTOSET(
IIF(COUNT(INTERSECT(MONTHSLEFT, DESCENDANTS(TIME.CURRENTMEMBER,
[Time].[Month]))) = COUNT(DESCENDANTS(TIME.CURRENTMEMBER, [Time].[Month]))
, "{" + MEMBERTOSTR(TIME.CURRENTMEMBER) + "}"
, "{}")
)
)’
/*Finally return the new range, with years and quarters substituted for
months where possible*/
SET MYNEWRANGE AS ‘{YEARSPRESENT, QUARTERSPRESENT, EXCEPT(MYRANGE,
DESCENDANTS({YEARSPRESENT, QUARTERSPRESENT},  [Time].[Month]))}’
/*Helper calculated members to display the steps*/
MEMBER MEASURES.STEP1 AS ‘SETTOSTR(MYRANGE)’
MEMBER MEASURES.STEP2 AS ‘SETTOSTR(YEARSPRESENT)’
MEMBER MEASURES.STEP3 AS ‘SETTOSTR(MONTHSLEFT)’
MEMBER MEASURES.STEP4 AS ‘SETTOSTR(QUARTERSPRESENT)’
MEMBER MEASURES.STEP5 AS ‘SETTOSTR(MYNEWRANGE)’
/*Show working*/
SELECT {MEASURES.STEP1, MEASURES.STEP2, MEASURES.STEP3, MEASURES.STEP4,
MEASURES.STEP5} ON 0
FROM SALES

From the response I got on the thread listed above where I first suggested this technique, if you have the righ aggregations in place this can make a big difference to performance.

 

2 thoughts on “Tuning YTD-style calculations

  1. Hi Chris,I\’ve responded on the OLAP newsgroup, as this blog interface seems to limit the length of comment posted(though it doesn\’t tell me what the limit is). – Deepak

Leave a Reply to DeepakCancel reply