NON EMPTY limitations

I was putting together a ridiculously large query for my previous post to try to test performance of the HAVING clause, and tried to run the following on Adventure Works:
SELECT [Measures].[Internet Order Quantity] ON 0,
NON EMPTY
[Customer].[Customer].MEMBERS *
[Date].[Date].MEMBERS *
[Product].[Product].MEMBERS *
[Geography].[City].MEMBERS
ON 1
FROM [Adventure Works]
To my surprise, I got the following error message:

Executing the query …

The expression contains a function that cannot operate on a set with more than 4,294,967,296 tuples.

Execution complete

Obviously there's some kind of internal limit here to do with NON EMPTY: the query runs if you remove it. Now, 4,294,967,296 tuples is a lot of tuples, but imagine you had a million customers and ten thousand products, and you wanted to find out which customers had bought which products – you'd want to find the non empty combinations in a set of tuples which is still larger than that limit. OK, in any normal cube this would probably bring back far too many rows to be a useful query, but what if you had a very specific slice in the WHERE clause (e.g. adding WHERE([Ship Date].[Date].&[1]) onto the end of the above query still gives the same message), so that you'd only expect to return a very small number of rows in your results? There are certainly going to be ways to work around this if you are writing your own MDX, but I wonder if any of the commercially available client tools which use NON EMPTY know to do this yet? Hmm…
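For reference, the sliced version of the query that still raises the error would look something like this (the set on rows is my reconstruction of the original crossjoin, based on the workaround given in the comments – the hierarchy names are assumptions):

SELECT [Measures].[Internet Order Quantity] ON 0,
NON EMPTY
[Customer].[Customer].MEMBERS *
[Date].[Date].MEMBERS *
[Product].[Product].MEMBERS *
[Geography].[City].MEMBERS
ON 1
FROM [Adventure Works]
WHERE([Ship Date].[Date].&[1])

Even though the slicer should cut the result down to a handful of rows, the error is raised before NON EMPTY gets a chance to filter the set.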

14 thoughts on “NON EMPTY limitations”

  1. Chris, we ran into this error quite a number of times (working with Reporting Services)… We were told that this was a beta problem, and from CTP to CTP we hit it less frequently… But as you see, the problem is still there… I'll check if we have some more information about that… Thomas

  2. Could you tell me who (what group) it was that told you this was a bug in the beta – was there an article on this that my team and I can look into further? Have you found any resolutions at all for this problem? Thanks, Angela

  3. Angela, I would guess that Thomas was probably told it was a bug by the Analysis Services dev team; I've certainly not seen any public information on this issue and I wouldn't expect there to be any until it was fixed. The workaround would be to use the NonEmpty function to filter each set before you crossjoin, something like this:

     SELECT [Measures].[Internet Order Quantity] ON 0,
     GENERATE(
     NONEMPTY([Customer].[Customer].MEMBERS, [Measures].[Internet Order Quantity]),
     NONEMPTY(
     [Customer].[Customer].CURRENTMEMBER *
     [Date].[Date].MEMBERS *
     [Product].[Product].MEMBERS *
     [Geography].[City].MEMBERS,
     [Measures].[Internet Order Quantity]))
     ON 1
     FROM [Adventure Works]

  4. Chris,
    do you have any update on that? I've now run into the same problem again… And: I have one box with SQL 2005 RTM, my colleague is using SP1. I'm getting the error, he doesn't get it but the query runs "forever"…

  5. As far as I understand it, the problem happens when the engine has to evaluate a set of this size – but in practice this only tends to happen when calculations that don't use bulk evaluation/block computation are involved, because when your calculations do use it, or you are querying real measures, the engine can work around the limit itself. So I think the key is to rewrite your calculation to be more efficient and the problem should disappear.
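     As an illustration of the kind of rewrite meant here (this example is mine, not from the original comment, and the measure names are made up), a classic cell-by-cell pattern is a FILTER over a large set, which can often be replaced with NonEmpty – note the two are only equivalent when the measure never holds zero or negative values, so treat this as a sketch:

     WITH
     // Evaluated cell-by-cell: forces the engine to touch every tuple in the set
     MEMBER [Measures].[Buying Customers Slow] AS
     COUNT(FILTER([Customer].[Customer].MEMBERS,
     [Measures].[Internet Order Quantity] > 0))
     // Block-computable alternative: lets the engine skip empty tuples
     MEMBER [Measures].[Buying Customers Fast] AS
     COUNT(NONEMPTY([Customer].[Customer].MEMBERS,
     [Measures].[Internet Order Quantity]))
     SELECT {[Measures].[Buying Customers Slow], [Measures].[Buying Customers Fast]} ON 0
     FROM [Adventure Works]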

  6. Well, the above query now returns an out-of-memory exception from SQL MS when I try to run it on AS2008, so it seems AS is happy to return the resultset – it's just that SQLMS can't display it! However, I'm not sure if the limitation has gone or the threshold at which this error is raised has just got higher. To be honest I hadn't hit this error in AS2005 since SP2 or thereabouts, so it might have been fixed some time ago.

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Can you be a bit more specific with what you’re doing please? You shouldn’t be using NonEmpty inside a scope statement…

  7. Hi Chris

    I am getting this error on SSAS 2012. I can't reduce it to a specific calculation and in fact if I comment out random measures it works. My thinking is that there is a relationship somewhere that's exploding out tuples, although when I slice by the dimensions individually I don't see a Cartesian result. Can you give any tips to troubleshoot this?


    1. Chris Webb says:

      Hi Wayne, usually this problem is associated with inefficient MDX calculated members, and the best solution is to rewrite them so they can be evaluated more efficiently by the engine. If you have control over the MDX query that your client tool is generating then you can also work around the problem by rewriting your MDX queries, but I recommend fixing the calculations.

  8. Hi Chris

    the calculations are all straightforward division and multiplication, with the exception of some formulas being made up of other calculated measures with CASE statements. In other words, I have some instances where I have several calculated measures (division + multiplication) and other measures that CASE over several formulas and return one of them based on flags or values. I'm not sure how I can rewrite this. One thing I should note is that the layout is:

    1 measure (about 10 million rows)
    1 measure (about 50k rows)

    there are several dimensions that span them both, and calculations are scoped on a single relationship (the most granular). When querying I am returning dimensions that are unrelated to the largest measure, which has a many-to-many relationship to those dimensions via the smaller measure. My thinking is that it is this design of the dimension relationships that is causing the tuple explosion. I am about to test this theory by pulling those keys over and naturalising the relationships to see if this reduces the tuples.

    I am using this design to save processing time on regular refreshes, because only the smaller measure changes daily. It replaces a design that calculated the results in SQL and served them up as a single measure.
