Named Sets and Block Computation in SSAS 2012

Greg Galloway (who really should blog more often!) recently came across an otherwise undocumented query performance optimisation in SSAS 2012: using named sets inside aggregate functions in MDX no longer prevents the use of block computation/bulk mode. This was something that was explicitly called out as a Bad Thing To Do in Books Online (the link to the page in question is now dead though, possibly because it’s being updated), but here’s an example of a query that will now run much faster in SSAS 2012 Multidimensional than it used to in R2:

with
set myset as {[Customer].[Customer].[Customer].members}
member measures.demo as
sum(myset,[Measures].[Internet Sales Amount])
select measures.demo on 0,
[Date].[Calendar Year].members on 1
from [Adventure Works]

There are still situations where block computation can’t be used however, namely when the Current() function is used (which are going to be very rare I think):

with
set myset as {[Customer].[Customer].[Customer].members}
member measures.demo as
sum(myset
, iif(myset.current is [Customer].[Customer].&[20075]
, 0, [Measures].[Internet Sales Amount])
)
select measures.demo on 0,
[Date].[Calendar Year].members on 1
from [Adventure Works]

Thanks also to Jeffrey Wang for the background on this.

22 thoughts on “Named Sets and Block Computation in SSAS 2012

  1. Yes, I’ve tested the same query now (on SQL 2012 BI edition) mentioned in your previous post (http://cwebbbi.wordpress.com/2011/03/16/referencing-named-sets-in-calculations/)

    CREATE SET [Adventure Works].ALLCUSTS AS [Customer].[Customer].[Customer].MEMBERS;

    CREATE MEMBER [Adventure Works].MEASURES.TEST1 AS
    COUNT(
    NONEMPTY(
    [Customer].[Customer].[Customer].MEMBERS
    , [Measures].[Internet Sales Amount])
    );

    CREATE MEMBER [Adventure Works].MEASURES.TEST2 AS
    COUNT(
    NONEMPTY(
    ALLCUSTS
    , [Measures].[Internet Sales Amount])
    );

    CREATE MEMBER [Adventure Works].MEASURES.TEST3 AS
    SUM(
    [Customer].[Customer].[Customer].MEMBERS
    , [Measures].[Internet Sales Amount]);

    CREATE MEMBER [Adventure Works].MEASURES.TEST4 AS
    SUM(
    ALLCUSTS
    , [Measures].[Internet Sales Amount]);

    SELECT [Date].[Calendar Year].MEMBERS ON 0,
    [Product].[Product].MEMBERS ON 1
    FROM [Adventure Works]
    WHERE(MEASURES.TEST4) –TEST1; TEST2; TEST3; –TEST 4
    –the results are: 2 sec; 11 sec; 1 sec; 1sec

  2. Hi Chris,

    Question 1: in the May 2013 Sydney training, did you say the SSAS 2012 version has to be certain patch/hot fix? or did you say it was SSAS2008R2 hotfix.

    Question 2: we have similar but different issue again in 2012, see below details, did you have similar experience and what is our work around?

    <<>>

    1. sorry, the details get filtered out as I was using comment characters:

      details: when calculated member is using a named set

      In 2012 version:
      our test result conclusion is:

      if the set3 is defined as
      Exists(SET1,SET2, MeasureGroup)
      or if the set3 is defined as
      Filter(Set1, logical expression)

      then the member which is defined as
      … use the Set3 in the calculated member (our example use function COUNT of set 3

      the performance goes way down for browsing the meta-data (e.g. browse attribute hierarchy members in SSMS) of the cube;

      our workaround for this is:
      copy all definition of set1, set2, set3 into the calculated member, and then get rid of set1,set2,set3; it worked fine.

      we did not test the query which uses the calculated member, because our focus was not the query performance at the time.

      1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        Can you share the complete MDX code?

  3. CREATE MEMBER CURRENTCUBE.[Measures].[The Measure] //This member contains slow sets that should not be separated out (meta-data browsing issue)
    AS
    COUNT
    (
    EXISTING { — this is set3, which is built by set 1 and set 2
    EXISTS(
    NONEMPTY — this is set 1
    (
    [Broadcast].[Broadcast ID].[Broadcast ID].MEMBERS,
    [Measures].[Count of Viewing]
    )
    ,Filter( — this is set 2
    [SessionDegenerate].[Session ID].[Session ID].MEMBERS
    ,[Measures].[Consecutive Duration]>=[Measures].[Reach Condition]
    )
    ,”The Measure Group Name”
    )
    }
    )

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      This is not the scenario described in my post – this set is much more complex. The optimisation only concerns simple expressions where the named set is used in the first parameter of functions like Sum() or Aggregate() and no other functions are used.

      1. Thanks Chris.

        Well, in the beginning, we were defining member like this:

        create dynamic set Set1
        as
        NONEMPTY — this is set 1
        (
        [Broadcast].[Broadcast ID].[Broadcast ID].MEMBERS,
        [Measures].[Count of Viewing]
        )

        create dynamic Set2
        as
        Filter( — this is set 2
        [SessionDegenerate].[Session ID].[Session ID].MEMBERS
        ,[Measures].[Consecutive Duration]>=[Measures].[Reach Condition]
        )

        create dynmaic Set3
        as
        EXISTS(
        set1
        ,set2
        ,”The Measure Group Name”
        )

        CREATE MEMBER CURRENTCUBE.[Measures].[The Measure]
        AS
        COUNT
        (EXISTING set3)

        anyway, even for SUM function, we experience the same thing, below is another example:

        CREATE MEMBER CURRENTCUBE.[Measures].[WeightForConsecutiveReached]
        AS
        SUM(
        EXISTING
        [TheSet] — this is a separate SET which needs to be moved inside to avoid problem of slow.
        ,[Measures].[Weight – H]
        )

        –and TheSet is defined:
        CREATE HIDDEN DYNAMIC SET CURRENTCUBE.[TheSet]
        AS

        EXISTS(
        [OneSet]
        ,Filter( //Do not separate to another set! Performance browsing meta-data in cube will be destroyed
        [SessionDegenerate].[Session ID].[Session ID].MEMBERS
        ,[Measures].[Consecutive Duration]>=[Measures].[Reach Condition]
        )
        ,”Measure Group”
        )

  4. I found named set actually helps for block calculation. I ran the query in sql as 2012, AdventureWorksDW2012Multidimensional-EE.

    with
    set allpro as root([Product])
    member measures.a as
    sum( ( [Date].[Date].&[20050101] : [Date].[Date].&[20051208]) *
    //root([Product])
    allpro
    , [Measures].[Internet Sales Amount])
    select measures.a on 0,
    non empty
    [Customer].[Customer].[Customer] on 1
    from [Adventure Works]

    If I use root([Product]) in measures.a instead of set allpro, it will take longer time and does more cell calculation. I am not sure if root function prevents block calculation. Can someone have any thought?

    York

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      I’m pretty sure the Root() function will prevent block computation – in fact it isn’t really meant to be used in calculations at all, only scope definitions.

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Hi Michael, the link you give documents that named sets prevent block computation in SSAS 2008 and as far as I can see it doesn’t make any distinction between static or dynamic named sets (and I wouldn’t expect there to be one). Equally, the performance improvements in 2012 should apply both to static and to dynamic named sets I think.

      The main reasons calculated members defined on the server perform better than calculated members defined in the WITH clause is that the former can usually have the values return cached for longer than the lifetime of a query – but if you were referencing a dynamic named set in your calculation, that would prevent caching from taking place. I don’t know why there was a difference between the performance of your calculations, and it could be down to a number of factors such as solve order.

      1. Hi Chris, thanks for your fast reply.

        I think I misinterpreted a sentence in the link provided which says in a note (below the section with improved set functions): “Static sets, including the empty set, will also benefit from the expected performance gains.”
        It’s odd that my querys return in a few seconds when I exchange the dynamic sets with static ones (which isn’t a long term option because of the subselect problem) and they take a few minutes to complete with the dynamic sets.

        Here is an example of one member where there is a problem with performance and which is fast when removing the DYNAMIC keyword. Maybe I am doing something completely wrong which prevents block computation (according to Mosha Existing isn’t necessary with named sets but some of my tests showed wrong results when not using it in combination with multiselect in WHERE):

        //Dynamic Sets are used to detect subselects
        CREATE HIDDEN DYNAMIC SET CURRENTCUBE.[Dim1Attr1Members]
        AS [Dim1].[Attr1].[Attr1].ALLMEMBERS;

        //Is Error is used to prevent dimension security errors
        CREATE HIDDEN DYNAMIC SET CURRENTCUBE.[Dim1Attr1J]
        AS IIF(IsError(Count({[Dim1].[Attr1].&[J]})), {}, { [Dim1].[Attr1].&[J] });

        CREATE HIDDEN DYNAMIC SET CURRENTCUBE.[Dim1Attr2N]
        AS IIF(IsError(Count({[Dim1].[Attr2].&[N]})), {}, { [Dim1].[Attr2].&[N] } );

        //”Calculation Base” is a real measure with Sum aggregate
        CREATE MEMBER CURRENTCUBE.[Measures].[Calculation]
        AS Aggregate({Existing [Dim1Attr1J]} * { Existing [Dim1Attr2N]}, [Measures].[Calculation Base]) * 0.25 +
        Aggregate(Except(Existing [Dim1Attr1J], { Existing [Dim1Attr1Members]}) * {Existing [Dim1Attr2N]}, [Measures].[Calculation Base]),
        VISIBLE = 1;

        Thanks in advance.

      2. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        My best guess is that it’s the IsError() function that’s causing your problem here. I would recommend that you create a static named set on the server that uses IfError() to handle the dimension security problems and then reference that named set inside your dynamic named set in the WITH clause.

      3. Hi Chris,

        I now reproduced the observed behavior in Adventure Works Cube. To me this looks clearly like a bug.

        The example calculates the Internet Sales Amount for Bikes according to the selected calendar years irrespective of where the calendar years are selected (Axis, Where, Subselect).

        Steps to reproduce:

        1)

        Add following at the end of Adventure Works Cube Script:

        Create Hidden Dynamic Set CurrentCube.[SelectedCalendarYears]
        AS [Date].[Calendar Year].[Calendar Year].ALLMEMBERS;

        Create Member CurrentCube.[Measures].[PerformanceTest]
        AS Aggregate(Existing [SelectedCalendarYears] * { [Product].[Category].&[1] }, [Measures].[Internet Sales Amount]),
        VISIBLE=1;

        2)

        Execute following Query in SSMS on cold cache:

        SELECT
        {
        [Measures].[PerformanceTest]
        } ON COLUMNS,
        {
        [Customer].[Customer].[Customer].ALLMEMBERS
        } ON ROWS
        FROM (SELECT { { [Date].[Calendar Year].&[2007] } } ON COLUMNS
        FROM [Adventure Works])

        Profiler shows a Duration of 1981ms for Query Cube End and 3870ms for Query End Events.

        3)

        Execute following Query in SSMS on cold cache (same as the first one, except the calculated member is defined in the WITH clause):

        WITH Member [Measures].[xPerformanceTest]
        AS Aggregate([SelectedCalendarYears] * { [Product].[Category].&[1] }, [Measures].[Internet Sales Amount])

        SELECT
        {
        [Measures].[xPerformanceTest]
        } ON COLUMNS,
        {
        [Customer].[Customer].[Customer].ALLMEMBERS
        } ON ROWS
        FROM (SELECT { { [Date].[Calendar Year].&[2007] } } ON COLUMNS
        FROM [Adventure Works])

        Profiler shows a duration of 607ms for Query Cube End and 2394ms for Query End Events. The difference is huge considering that the
        first query should be faster. The difference is even greater on warm cache.

        4)

        Change the lines in the cube script from step 1 to the following:

        Create Hidden Dynamic Set CurrentCube.[SelectedCalendarYears]
        AS [Date].[Calendar Year].[Calendar Year].ALLMEMBERS;

        Create Member CurrentCube.[Measures].[SelectedCalendarYearsStr]
        AS SETTOSTR([SelectedCalendarYears]),
        VISIBLE=1;

        Create Member CurrentCube.[Measures].[PerformanceTest]
        AS Aggregate(Existing STRTOSET(“[SelectedCalendarYears]”) * { [Product].[Category].&[1] }, [Measures].[Internet Sales Amount]),
        VISIBLE=1;

        A new calculated member has been inserted and the reference to [SelectedCalendarYears] has been wrapped in a STRTOSET. Without the calculated member no difference is observed in the next step.

        5)

        Rerun the querys from step 2 and 3 on cold cache and the result is as follows:

        Query 1: 577ms for Query Cube End and 2707ms for Query End events
        Query 2: 607ms for Query Cube End and 611ms for Query End events

        On warm cache both querys are nearly the same.

        So both querys are a lot faster when a calculated member that does nothing but reference the named set is inserted and the real reference is wrapped in a STRTOSET.

      4. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        Yes, I can repro this. I’ve tried a few things and I don’t seem to be able to work around the problem. It’s not a bug (there’s no error or incorrect results) but it looks like a scenario that should be tuned. Sorry…

  5. Hi Michael,

    The ‘slow’ problem you described is similar to the one I commented above, see posts “24 – 28th October 2013” and here: https://connect.microsoft.com/SQLServer/feedback/details/484865/calcuated-member-with-a-reference-to-dynamic-named-set-kills-the-cubes-performance

    In the end we had to move the definition of the dynamic set inside of the calculated member, and got rid of the dynamic set.
    Of course, I saw people were suggesting to use ‘StrToSet/SetToStr’ to avoid this problem too.

    As Chris suggested, my issue was not really related to the original post, so I did not follow up.

    Regards,
    Min

    1. Hi Min,

      you are right it’s similar but not the same since I only experience slow performance when using the calculated member which is referencing the named set. Overall cube performance is not reduced. I even have to include a dummy calculated member ([Measures].[SelectedCalendarYearsStr] in my example above) with a direct reference (not wrapped in StrToSet) to the named set to see the performance gain on real calculated member ([Measures].[PerformanceTest]). Anyway I guess there really is something wrong with dynamic sets in the cube script and I will have to go with workaround.

  6. After struggling with “slow” performance (SSAS 2008 R2) of a query generated by a 3rd party tool, I came across this. Besides being a good reason to upgrade to 2012/2014, why don’t they just allow named sets to work?

    What’s crazy is that I can take the body of the named set, put it in the where close, and it just works fine. There is no code difference between having that code in the where clause directly, and the definision of the named set, except that I have to follow up with adding a second member that aggregates that named set. This is a huge performance hit if your named set is an Except set on a very large dimension…

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Well, I guess if it was easy to get good performance for named sets in aggregate functions they would have implemented it at the start. I don’t know the technical background to this, but I’m sure there are some good reasons why it could only be addressed in SSAS 2012.

Leave a Reply to MinCancel reply