I was recently involved in an interesting discussion about the negative performance impact of referencing named sets inside calculated members. It’s an issue that’s dealt with in this topic in BOL, along with lots of other useful tips for things to avoid when writing MDX calculations:
http://msdn.microsoft.com/en-us/library/bb934106.aspx
Since I see lots of people making this mistake, though, I thought it was nonetheless worth a blog post; it’s certainly very easy to reproduce in Adventure Works. Take the following set of calculations:
CREATE SET ALLCUSTS AS [Customer].[Customer].[Customer].MEMBERS; CREATE MEMBER CURRENTCUBE.MEASURES.TEST1 AS COUNT( NONEMPTY( [Customer].[Customer].[Customer].MEMBERS , [Measures].[Internet Sales Amount]) ); CREATE MEMBER CURRENTCUBE.MEASURES.TEST2 AS COUNT( NONEMPTY( ALLCUSTS , [Measures].[Internet Sales Amount]) ); CREATE MEMBER CURRENTCUBE.MEASURES.TEST3 AS SUM( [Customer].[Customer].[Customer].MEMBERS , [Measures].[Internet Sales Amount]); CREATE MEMBER CURRENTCUBE.MEASURES.TEST4 AS SUM( ALLCUSTS , [Measures].[Internet Sales Amount]);
You’ll notice that TEST1 and TEST2 are essentially the same calculation, as are TEST3 and TEST4; the only difference between them is that the set expressions in TEST1 and TEST3 have been replaced by references to the named set ALLCUSTS in TEST2 and TEST4.
Now run the following query four times on a cold cache, each time putting a different calculated measure from the list above in the WHERE clause:
SELECT [Date].[Calendar Year].MEMBERS ON 0, [Product].[Product].MEMBERS ON 1 FROM [Adventure Works] WHERE(MEASURES.TEST1)
On my machine the query with TEST1 took 874ms to run; the query with TEST2 took 6302ms; the query with TEST3 took 234ms; and the query with TEST4 I ended up killing after a few minutes.
So, clearly, as the article says referencing a named set inside one of the MDX aggregation functions in a calculation is a Very Bad Thing for performance and something to be avoided at all costs. While it might seem an appealing thing to do for readability, the downsides are significant.
And the real problem is that dynamic sets (which are the only generally working solution to the multiselect problems) suffer from the performance penalty.
Honestly saying, I don’t quite understand where SSAS takes the wrong path. MDX Studio, for instance, provides the same statistics for [Customer].[Customer].[Customer].MEMBERS, SET and DYNAMIC SET (except the time of course):
Cold cache execution
Time : 11 sec 802 ms
Calc covers : 0
Cells calculated : 2388
Sonar subcubes : 2404
NON EMPTYs : 2394
Autoexists : 0
EXISTINGs : 0
SE queries : 2398
Flat cache insert : 1
Cache hits : 2399
Cache misses : 4
Cache inserts : 3
Cache lookups : 2403
Memory Usage KB : 29612
Warm cache execution
Time : 10 sec 270 ms
Calc covers : 0
Cells calculated : 2388
Sonar subcubes : 2402
NON EMPTYs : 2394
Autoexists : 0
EXISTINGs : 0
SE queries : 2394
Flat cache insert : 1
Cache hits : 2398
Cache misses : 0
Cache inserts : 0
Cache lookups : 2398
Memory Usage KB : -1928
Looks more like a documented bug (i.e. a feature 😉 ) for me
Dynamic sets are the subject of a forthcoming post… but yes, it’s crazy that one of the few solutions to the multiselect problem suffer from performance issues.
Hi Chris how about referencing named members within a scope statement? it seems if I use the following create member statement my query will time out.
CREATE MEMBER CURRENTCUBE.[LastDayMember] AS
TAIL(
NONEMPTY(
EXISTS([Date].[Calendar Date].[Date].MEMBERS,
[Date].[Is Working Day].&[1]),
[Measures].[X])
,1).ITEM(0)
;
If I explicitly give the member a value such as
[Date].[Calendar Date].[Date].&[20120928]
The query runs in 7 seconds which I find strange because the above member statement runs extremely quick standalone.
Hi John,
No, this post is nothing to do with your problem – you’re not using named sets. You’ve just got a badly performing calculation! I’ll reply to your email re NUMA soon – I’ve been pretty busy for the last few days…
Chris
Hi Chris,
Good to hear from you :-). Yes its a create member statement member however from my diagnosis it suggests that its not being cached or just not liked being referenced? There is more details on the forum post
http://social.msdn.microsoft.com/Forums/en-US/sqlanalysisservices/thread/b39eb832-dadf-4e77-9bee-7c25009337ae
I found other more complicated scenarios where I don’t reference a named member I get excellent performance
John
hi… can you tell me whether I can use the dynamic sets in a scope statement. Currently I have a bulk of scope statements using a static set. But when I convert them to dynamic the processing is failed.
Is there any alternative for this.
Scope statements are only evaluated once, when the MDX Script is processed, so whatever you were trying to do by using dynamic sets in them it wouldn’t have worked unfortunately.
Hi Chris,
Thanks for the post, and the many others you have provided.
I am fairly new to MDX and having to learn my way as I go. Without your posts and the posts of other generous bloggers I would be getting nowhere fast!
Is the performance issue that is observed with named sets replicated when using calculated measures within other calculated measures or is this specific to named sets?
I have used calculated measures inside other calculated measures in order to obtain a running total against a non-time based dimension. In order to do so I have used the Sum function, the Rank function and the Head function. However, my MDX query takes some time to run. I wondered whether this is because of using calculated measure names within other calculated measures?
Many thanks in advance for any advice you are able to provide.
Carl
Hi Carl,
This is specific to named sets, and it has been ‘fixed’ in SSAS 2012 too, so this post is only relevant to 2008R2 and earlier versions.
Your performance problem is more likely to be related to the functions you’re using in your MDX, or the way you’re writing the calculation. Drop me an email offline (my contact details can be found at http://www.crossjoin.co.uk) and I can take a look at your code if you want.
Regards,
Chris
Thanks for writing this up.
Is this fixed in latest update SQLServer 2008 R2 SP2 Cu6 ? We use this in lot many places and also client tool used do fire aggregate over namedSet.
No, I’m pretty sure it isn’t
Hi Chris, thanks for this good post.
Working in a cube with financial data, I was just about to define some named sets containing members from the Account dimension. I thought this would make sense since it would make the code look clearer in the calculated measures I am about to define (which would reference the named sets). I was also thinking that this was a good practice and maybe would speed things up as well!
But as I understand it from your post, you do not recommend doing this? Performance wise it is better to not use named sets but instead define the sets within the calcuated measures themselves?
I am a beginner in MDX and working in SSAS 2008R2. Do you happen to have a link with information how this issue has been “fixed” in the SSAS 2012 version?
Martin
Yes, using named sets in MDX functions like Sum() and Aggregate() is bad for performance before SSAS 2012; in SSAS 2012 this should no longer be a problem, as I blogged here:
http://cwebbbi.wordpress.com/2012/05/16/named-sets-and-block-computation-in-ssas-2012/