Referencing Named Sets in Calculations

I was recently involved in an interesting discussion about the negative performance impact of referencing named sets inside calculated members. It’s an issue that’s dealt with in this topic in BOL, along with lots of other useful tips for things to avoid when writing MDX calculations:
http://msdn.microsoft.com/en-us/library/bb934106.aspx

Since I see lots of people making this mistake, though, I thought it was nonetheless worth a blog post; it’s certainly very easy to reproduce in Adventure Works. Take the following set of calculations:

CREATE SET ALLCUSTS AS [Customer].[Customer].[Customer].MEMBERS;

CREATE MEMBER CURRENTCUBE.MEASURES.TEST1 AS 
COUNT(
NONEMPTY(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount])
);

CREATE MEMBER CURRENTCUBE.MEASURES.TEST2 AS 
COUNT(
NONEMPTY(
ALLCUSTS
, [Measures].[Internet Sales Amount])
);

CREATE MEMBER CURRENTCUBE.MEASURES.TEST3 AS 
SUM(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount]);

CREATE MEMBER CURRENTCUBE.MEASURES.TEST4 AS 
SUM(
ALLCUSTS
, [Measures].[Internet Sales Amount]);

 

You’ll notice that TEST1 and TEST2 are essentially the same calculation, as are TEST3 and TEST4; the only difference between them is that the set expressions in TEST1 and TEST3 have been replaced by references to the named set ALLCUSTS in TEST2 and TEST4.

Now run the following query four times on a cold cache, each time putting a different calculated measure from the list above in the WHERE clause:

SELECT [Date].[Calendar Year].MEMBERS ON 0,
[Product].[Product].MEMBERS ON 1
FROM [Adventure Works]
WHERE(MEASURES.TEST1)
 

On my machine the query with TEST1 took 874ms to run; the query with TEST2 took 6302ms; the query with TEST3 took 234ms; and the query with TEST4 I ended up killing after a few minutes.

So, clearly, as the article says referencing a named set inside one of the MDX aggregation functions in a calculation is a Very Bad Thing for performance and something to be avoided at all costs. While it might seem an appealing thing to do for readability, the downsides are significant.

13 thoughts on “Referencing Named Sets in Calculations

  1. And the real problem is that dynamic sets (which are the only generally working solution to the multiselect problems) suffer from the performance penalty.
    Honestly saying, I don’t quite understand where SSAS takes the wrong path. MDX Studio, for instance, provides the same statistics for [Customer].[Customer].[Customer].MEMBERS, SET and DYNAMIC SET (except the time of course):

    Cold cache execution
    Time : 11 sec 802 ms
    Calc covers : 0
    Cells calculated : 2388
    Sonar subcubes : 2404
    NON EMPTYs : 2394
    Autoexists : 0
    EXISTINGs : 0
    SE queries : 2398
    Flat cache insert : 1
    Cache hits : 2399
    Cache misses : 4
    Cache inserts : 3
    Cache lookups : 2403
    Memory Usage KB : 29612

    Warm cache execution

    Time : 10 sec 270 ms
    Calc covers : 0
    Cells calculated : 2388
    Sonar subcubes : 2402
    NON EMPTYs : 2394
    Autoexists : 0
    EXISTINGs : 0
    SE queries : 2394
    Flat cache insert : 1
    Cache hits : 2398
    Cache misses : 0
    Cache inserts : 0
    Cache lookups : 2398
    Memory Usage KB : -1928

    Looks more like a documented bug (i.e. a feature 😉 ) for me

    1. Dynamic sets are the subject of a forthcoming post… but yes, it’s crazy that one of the few solutions to the multiselect problem suffer from performance issues.

  2. Hi Chris how about referencing named members within a scope statement? it seems if I use the following create member statement my query will time out.

    CREATE MEMBER CURRENTCUBE.[LastDayMember] AS
    TAIL(
    NONEMPTY(
    EXISTS([Date].[Calendar Date].[Date].MEMBERS,
    [Date].[Is Working Day].&[1]),
    [Measures].[X])
    ,1).ITEM(0)
    ;

    If I explicitly give the member a value such as
    [Date].[Calendar Date].[Date].&[20120928]
    The query runs in 7 seconds which I find strange because the above member statement runs extremely quick standalone.

    1. Hi John,

      No, this post is nothing to do with your problem – you’re not using named sets. You’ve just got a badly performing calculation! I’ll reply to your email re NUMA soon – I’ve been pretty busy for the last few days…

      Chris

  3. Hi Chris,
    Good to hear from you :-). Yes its a create member statement member however from my diagnosis it suggests that its not being cached or just not liked being referenced? There is more details on the forum post
    http://social.msdn.microsoft.com/Forums/en-US/sqlanalysisservices/thread/b39eb832-dadf-4e77-9bee-7c25009337ae
    I found other more complicated scenarios where I don’t reference a named member I get excellent performance
    John

  4. hi… can you tell me whether I can use the dynamic sets in a scope statement. Currently I have a bulk of scope statements using a static set. But when I convert them to dynamic the processing is failed.

    Is there any alternative for this.

    1. Scope statements are only evaluated once, when the MDX Script is processed, so whatever you were trying to do by using dynamic sets in them it wouldn’t have worked unfortunately.

  5. Hi Chris,

    Thanks for the post, and the many others you have provided.

    I am fairly new to MDX and having to learn my way as I go. Without your posts and the posts of other generous bloggers I would be getting nowhere fast!

    Is the performance issue that is observed with named sets replicated when using calculated measures within other calculated measures or is this specific to named sets?

    I have used calculated measures inside other calculated measures in order to obtain a running total against a non-time based dimension. In order to do so I have used the Sum function, the Rank function and the Head function. However, my MDX query takes some time to run. I wondered whether this is because of using calculated measure names within other calculated measures?

    Many thanks in advance for any advice you are able to provide.

    Carl

    1. Hi Carl,

      This is specific to named sets, and it has been ‘fixed’ in SSAS 2012 too, so this post is only relevant to 2008R2 and earlier versions.

      Your performance problem is more likely to be related to the functions you’re using in your MDX, or the way you’re writing the calculation. Drop me an email offline (my contact details can be found at http://www.crossjoin.co.uk) and I can take a look at your code if you want.

      Regards,

      Chris

  6. Thanks for writing this up.
    Is this fixed in latest update SQLServer 2008 R2 SP2 Cu6 ? We use this in lot many places and also client tool used do fire aggregate over namedSet.

  7. Hi Chris, thanks for this good post.

    Working in a cube with financial data, I was just about to define some named sets containing members from the Account dimension. I thought this would make sense since it would make the code look clearer in the calculated measures I am about to define (which would reference the named sets). I was also thinking that this was a good practice and maybe would speed things up as well!

    But as I understand it from your post, you do not recommend doing this? Performance wise it is better to not use named sets but instead define the sets within the calcuated measures themselves?

    I am a beginner in MDX and working in SSAS 2008R2. Do you happen to have a link with information how this issue has been “fixed” in the SSAS 2012 version?

    Martin

Leave a Reply to John ShiangoliCancel reply