How To Optimise The Performance Of MDX Queries That Return Thousands Of Rows

Chris Webb Analysis Services, MDX August 26, 2015July 17, 2015 2 Minutes

One problem I encounter on a regular basis is how to optimise the performance of MDX queries that return thousands, hundreds of thousands, or even millions of rows. The advice I give is always the same:

Reduce the number of rows that your query returns!

Yes, there are some things you can change in your queries and cube design to improve performance, but these are the same things I’d suggest for any query (build aggregations, rewrite MDX, partition etc etc). In my opinion, if you have a query that returns a ridiculously large number of rows you are doing something fundamentally wrong.

There are three reasons why SSAS developers write this kind of query:

They are doing a data-dump from SSAS to another system. Mostly the developer doesn’t realise this though, since the other system is Excel-based and the end user has disguised their requirement as a report. In most cases, user education about how to use Excel with SSAS results in an approach that doesn’t require dumping thousands of rows of data to an Excel worksheet.I will admit that I have seen a few cases where developers need to dump data out of SSAS for other purposes, and have no option but to use SSAS because they have to add complex calculations that can only feasibly be implemented in MDX. These are very rare though, and most of the time using SQL queries against the underlying relational database works a lot better.
The end users have specified a report that returns lots of data, because that’s just what they want, dammit! Often this is to recreate a report built in a previous system that, at some point in the 1970s, was printed out into a gigantic book every month. My argument here is that a report should return no more data than can be seen on a screen without scrolling. If you need to scroll in a report, you probably should be giving the end user more parameters to filter that report so they can find the data they want to see more easily instead.Of course it’s one thing to know what you should be doing, it’s another thing entirely to tell the CFO that their requirements are stupid. If you can’t convince your end users that you know better than them, you have my sympathy. Usually I find that having to choose between the poor performance of what they want and the better performance of a different approach helps them come to their senses.
Finally, the way that SSRS handles drilling down in reports often leads report developers to bring back vast amounts of data. The advice to increase the number of parameters for filtering is equally relevant here, but you can also use MDX techniques like this one to implement drill down in a much more efficient way.

At the end of the day, SSAS just isn’t optimised for returning large resultsets – it was designed to return PivotTable-style queries, which are always relatively small. You can get good performance for large resultsets if you know what you’re doing, you have the time, and you’re lucky, but you’ll usually be better off rethinking your requirements or choosing a different tool.

Published by Chris Webb

My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel. View all posts by Chris Webb

Published August 26, 2015July 17, 2015

13 thoughts on “How To Optimise The Performance Of MDX Queries That Return Thousands Of Rows”

jerome says:

August 26, 2015 at 1:32 pm

In my case the users think that an OLAP cube is (finally) the access to the raw data. so they try to dump the data out from the prod system through the cube. (else they cant access the data quickly, they have to rely on CR taking weeks of delay)
they open the cube in Excel, dump near all the dimensions in excel (day level, products etc…) resulting in huge resultsets.

the problem here, the IT department dont understand the needs and dont want to provide a selfservice access point (like an OData web site) and they want to force the users to go through the cube.
the users are conducting advanced analysis (like complex basket analysis) which cant be answered by a cube. and IT dont want to create small dedicated cubes… 🙂
so an infinite loop of non communication…
after millions dollars invest in the cube (yes millions), finally the power users are using the old way with CR to get the raw data!!!! (but the project is a success… crazy, no?) 🙂

Loading...

Reply
Erik Svensen says:

August 26, 2015 at 4:56 pm

I totally agree on your solution 🙂 – lets hope the Power BI Desktop visualizations can help us to develop better and less detailed reports and we can move away from hopelessly long SSRS reports that are exported to PDF and/or exported to Excel.

Loading...

Reply
Davide Mauri says:

August 26, 2015 at 9:11 pm

While I face the same problems you mention (users that wants to download tons of data), I must say that problems does not come from MDX or the Cube itself, but from the queries that Excel produces when you use PivotTable. Rewriting the same queries with a better MDX (that, for example, avoid to extract subtotals when you don’t want them) allows to have very good performance…it’s quite common to extract up to 600.000 rows (don’t ask why the end user “needs” it…) in couple of seconds. Of course we had to write our own Excel plugin in order to rewrite the generated MDX on the fly….

Loading...

Reply
1. Chris Webb says:
  
  August 27, 2015 at 1:14 am
  
  Hi Davide, I know about the problems with Excel subtotals and the MDX it generates, and I have blogged about the workarounds several times. For fairly simple queries you’re right, you can get good performance, but it’s not always the case. The point I wanted to make here was that it’s the requirement itself that needs to be questioned.
  
  Loading...
  
  Reply
2. Aleks says:
  
  August 29, 2015 at 2:07 pm
  
  Hi Davide,
  A plug-in exists somewhere?
  
  Loading...
  
  Reply
  1. Davide Mauri says:
    
    September 1, 2015 at 10:54 am
    
    Nope we created it for one of our client. Basically we get all the objects used in the OLAP Pivotable and then we create a new “standard” pivottable on-the fly writing the MDX on our own, and this optimizing it *a lot*.
    
    Loading...
agilmore30 says:

August 27, 2015 at 12:56 am

Thanks Chris.

I see this issue on occasion too. I always describe SSAS as a high performance AGGREGATED query engine. It’s a poor non-aggregated query engine. For non-aggregated queries, I recommend querying off the underlying relational data warehouse.

Cheers – Adam .

Loading...

Reply
1. Imke Feldmann says:
  
  August 27, 2015 at 8:41 pm
  
  Hi Adam,
  would you then import the data on detail level into your cube at all or take the lowest aggregation level as grain level for your cube instead?
  Thanks, Imke
  
  Loading...
  
  Reply
  1. agilmore30 says:
    
    August 28, 2015 at 1:53 am
    
    Hi Imke
    
    In principle, I try to include the lowest grain (i.e. detail level) of fact tables in the cube. This is ‘Kimball’ best practice. It’s difficult to try to guess what grain users may want. Rolling up to a higher grain may cause you issues, when you try to associate the fact with dimensions, where the lowest grain of the dimension is the detail level. What key do you associate on? I think Chris wrote a post on this recently. There are some exceptions of course. One example may be where you have some detector transactions which capture data each second. In your Cube, if the lowest grain of your time dimension was minute or hour, then you might want to aggregate that detector data to the minute/hour when loading it into the cube.
    
    Hope that helps – Adam.
    
    Loading...
  2. Imke Feldmann says:
    
    August 28, 2015 at 6:58 am
    
    Yes, thanks – that answered my question.
    
    Loading...
Pingback: Dew Drop – August 27, 2015 (#2078) | Morning Dew
Jamie says:

September 1, 2015 at 10:13 am

If your query tool supports SSAS Actions you can make use of the ASSP.GetCustomDrillthroughMDX procedure discussed here (http://www.sqlservercentral.com/Forums/Topic1598965-147-1.aspx).
I have successfully used this (accessed within Pyramid Analytics), and the performance is surprisingly good considering the number of records being returned. It keeps the users very happy.

Loading...

Reply
Pingback: Monitoring SSAS Multidimensional MDX Query Results Serialisation, Part 2 – Chris Webb's BI Blog