Recently I was involved in troubleshooting a mysterious Analysis Services Multidimensional performance problem for a customer: the team worked out that certain queries run by certain users were extremely slow, and that these users were members of roles where dimension security was applied, but the amount of slowdown – queries going through the role were taking over 10 minutes compared to a few seconds when run as an administrator – was unlike anything I had seen before. It turned out that the cause was having the Enable Visual Totals box checked on every attribute on the dimension where security was applied, not just the attributes whose members were secured.
I can’t reproduce the problem with the Adventure Works cube but I can use it to illustrate the problem. Let’s say you have a role that applies dimension security on the Country attribute of the Customer dimension:
Normally, in this scenario, you would only check the Enable Visual Totals box for the Country attribute:
When running a query with this role applied, in the Query Subcube Verbose event in Profiler you will see a slice is put on the Country attribute:
However, if the Enable Visual Totals box is checked for every attribute on the dimension then a slice is put on every attribute that has its hierarchy enabled:
The more of these slices there are the slower everything gets inside Analysis Services: slower scans, slower cache registry lookups and so on. In the case of the cube I was looking at the combination of all of these slices, extremely complex MDX calculations and unprocessed indexes led to the massive performance problem. Obviously if you have to use Enable Visual Totals on your role then you have to use it, and it’s extremely unlikely you will encounter this problem, but it’s good to know about it just in case.
Back in 2018 I wrote a blog post about how to troubleshoot web service refresh issues using a third party tool called Fiddler. It’s still relevant but Fiddler is a complex tool and installing it and giving it the right permissions to see everything it needs to see isn’t easy. Luckily, the new Power Query Diagnostics feature in Power BI (released in October 2019) means you don’t need to use Fiddler for this kind of work any more. Full details on how to use this feature are given here, but I thought it would be useful to focus on one particular use case for it.
Let’s take one of the Power Query queries from my previous post as an example:
Source = OData.Feed("https://api.parliament.uk/odata"),
#"Filtered Rows" =
each ([GroupName] = "Cabinet Office")
[Note that this is a public web service which requires no authentication, so you will be able to run this query yourself]
This query connects to the UK Houses of Parliament OData API and returns the table of government organisations filtered down to the row where the GroupName column equals the text “Cabinet Office”. What requests are made to the web service when this query runs?
The first thing to point out is that different things will happen depending on whether you refresh the preview window in the Power Query Editor or refresh the dataset inside the main Power BI window. If you’re doing performance tuning you should ignore what happens when you refresh the preview in the Power Query Editor and always refresh from the main Power BI window.
The second is that, when you refresh a dataset (or a table within a dataset) from the main Power BI window, you might find that more work is being done than you expect as a result of “background refresh” taking place. This is something I blogged about here and you should always turn it off before doing any performance tests, because it’s not something that will happen in the service and it can also slow things down – potentially a lot if there are lots of queries and steps. You can turn it off from the Options dialog in the Data Load tab by deselecting “Allow data preview to download in the background”.
The third is that the Query Diagnostics feature I’m going to use is, at the time of writing, in Preview, so you may need to enable it in the Preview features pane of the Options dialog:
With that done, here are the steps to find out what requests are made to the web service when the above query is run.
First, let the query load data into the dataset as usual without doing anything else. You’ll see the table that the query has loaded data into in the Fields pane of the main Power BI window:
Next, reopen the Power Query Editor window, go to the Tools tab and click on the Start Diagnostics button.
Do not do anything else in the Power Query Editor window, instead minimise (do not close) it and then go back to the main Power BI window. There, right-click on the table that holds the output of the query in the Fields pane, right-click and select “Refresh data”:
Once the refresh has taken place, go back to the Power Query Editor window and click the Stop Diagnostics button:
When you have done that, you’ll see two new queries created in a group called Diagnostics:
These queries contain the diagnostics information for the refresh you’ve just done. The query with “Detailed” in its name has all the detailed information; the other query contains summarised data.
Looking at the “Detailed” query, each row represents a single operation that takes place within the Power Query engine when the refresh took place and while I don’t know what all this means (I’m working on it!), if you go to the Data Source Query column you’ll see a list of all the HTTP requests made to the web service:
Clicking on an individual cell in this column allows you to see the full text of the request and response made in the preview pane underneath the results pane:
While Fiddler may still be necessary for some more advanced scenarios, this is a much easier way of troubleshooting web service data sources in the Power Query Editor, for example when you need to check the exact request made to see if query folding is taking place on an OData data source.
Among all the exciting announcements made at Ignite last week, one you may have missed (even in all the Excel-related announcements here) was that dynamic arrays have finally reached GA. Ever since they were announced I’ve been interested in how they can be used with Excel cube functions, which allow you to get data from Analysis Services, Power BI and the Excel Data Model/Power Pivot into cells on the worksheet, and I’ve blogged about this once already. Even though right now the CubeValue function can’t be used with dynamic arrays – alas – there are still some interesting ways dynamic arrays and cube functions can be used together and in this post I’ll show you another one I’ve found.
Say you have the following set of tables in the Excel Data Model:
And the following measure that counts the number of customers who bought something:
Let’s also say you have on a worksheet the following table of data showing the number of distinct customers broken down by product, created using the CubeMember and CubeValue functions:
Here are the formulas for this table:
So far, all very straightforward. Now let’s say you want to use a dynamic array to filter the rows in this table so you only see the products that have more than a certain number of distinct customers, and you want to see a grand total for this filtered list of customers, like so:
Here are the formulas to achieve this:
To explain what’s going on here:
Cell C2 contains the text “ThisWorkbookDataModel”, the name of the connection to the Excel Data Model. Doing this makes the other Excel formulas here much easier to read!
Cell F2 – which I’ve turned into a named range called FilterThreshold – contains the number of customers which a product must exceed to be shown.
Cell E5 contains the dynamic array Filter function that filters the rows shown in the first two screenshots above and only returns the products that exceed the threshold. The formula is:
Cell E6 does something similar to E5 but returns the number of distinct customers for each product. If all I wanted was the products and the number of distinct customers I wouldn’t have needed two separate formulas, I could have used one, but I need to add the grand total too…
Cell E4 contains the formula:
CUBESET($C$2, E5#, “Total”)
This creates a set from the spill reference E5#, which contains the cells containing CubeMember formulas returned by the Filter function. For some reason CubeSet returns an error if you try to use a dynamic array formula in its second parameter but using a spill reference works ok; this is why I had to use separate formulas in E5 and E6.
Finally, cell F4 contains the formula that returns the total distinct count for all products returned in the filtered list using CubeValue and the set created in cell E4:
CUBEVALUE($C$2, “[Measures].[Distinct Customers]”, E4)
Since this is a distinct count the only way to get the correct value here is using CubeSet, CubeValue and the DAX measure – there would be no way to calculate the correct value using Excel formulas. The same could be said of almost any other DAX measure that wasn’t a simple aggregation.
You can download the sample workbook for this post here.
If you’re building DAX queries using Power BI Report Builder you might notice something that looks like a new DAX function called RSCustomDaxFilter. For example, here’s a simple DAX query built from the Adventure Works Tabular demo database, with one measure and one parameter built Calendar Year:
Note that the “Enable Multi Value Parameters” option has been selected. Here’s what you’ll see if you view the text of the DAX query in the Query Designer:
"Internet Total Sales",
[Internet Total Sales]
But what is RSCustomDaxFilter? If you run your report and see what happens on the server using SQL Server Profiler, here’s the query that actually gets run (in this case I selected the years 2013 and 2014 for the parameter):
VALUES ( 'Date'[Calendar Year] ),
( 'Date'[Calendar Year] = VALUE ( "2013" ) )
|| ( 'Date'[Calendar Year] = VALUE ( "2014" ) )
"Internet Total Sales", [Internet Total Sales]
What has happened is that RSCustomDaxFilter has been replaced with an expression using the DAX Filter() function that implements the filter on the selected years; it’s just a placeholder for a dynamically-generated DAX expression that is substituted in at runtime.
Why is it needed? Handling multi-value parameters is difficult in DAX when you don’t know how many values are going to be passed to the parameters (it’s a subject I’ve blogged about here and here) and some kind of dynamic code generation is a reasonable solution to this problem.
There is one drawback with this approach though – it can generate a DAX query that is too long to be executed. Here’s the error message you’ll see:
The specified query is too complex to be evaluated as a single statement.
I’m not sure what the maximum length of a query is in DAX – I suspect it’s 32768 characters. You’ll hit this limit if you create a parameter on a field with thousands of values in and then select all but a few of these values; from the example above you can imagine how long the resulting DAX query could be:
There’s no single workaround for this but some custom DAX (such as the example here) will be necessary; you’ll just need to ensure that the DAX query is as short as possible.