Troubleshooting Web Service Refresh Problems In Power BI With The Power Query Diagnostics Feature

Back in 2018 I wrote a blog post about how to troubleshoot web service refresh issues using a third party tool called Fiddler. It’s still relevant but Fiddler is a complex tool and installing it and giving it the right permissions to see everything it needs to see isn’t easy. Luckily, the new Power Query Diagnostics feature in Power BI (released in October 2019) means you don’t need to use Fiddler for this kind of work any more. Full details on how to use this feature are given here, but I thought it would be useful to focus on one particular use case for it.

Let’s take one of the Power Query queries from my previous post as an example:

let
    Source = OData.Feed("https://api.parliament.uk/odata"),
    GovernmentOrganisation_table =
    Source{[
        Name="GovernmentOrganisation",
        Signature="table"
    ]}[Data],
    #"Filtered Rows" =
    Table.SelectRows(
        GovernmentOrganisation_table,
        each ([GroupName] = "Cabinet Office")
    )
in
    #"Filtered Rows"

[Note that this is a public web service which requires no authentication, so you will be able to run this query yourself]

This query connects to the UK Houses of Parliament OData API and returns the table of government organisations filtered down to the row where the GroupName column equals the text “Cabinet Office”. What requests are made to the web service when this query runs?

The first thing to point out is that different things will happen depending on whether you refresh the preview window in the Power Query Editor or refresh the dataset inside the main Power BI window. If you’re doing performance tuning you should ignore what happens when you refresh the preview in the Power Query Editor and always refresh from the main Power BI window.

The second is that, when you refresh a dataset (or a table within a dataset) from the main Power BI window, you might find that more work is being done than you expect as a result of “background refresh” taking place. This is something I blogged about here and you should always turn it off before doing any performance tests, because it’s not something that will happen in the service and it can also slow things down – potentially a lot if there are lots of queries and steps. You can turn it off from the Options dialog in the Data Load tab by deselecting “Allow data preview to download in the background”.

image 

The third is that the Query Diagnostics feature I’m going to use is, at the time of writing, in Preview, so you may need to enable it in the Preview features pane of the Options dialog:

image

With that done, here are the steps to find out what requests are made to the web service when the above query is run.

  1. First, let the query load data into the dataset as usual without doing anything else. You’ll see the table that the query has loaded data into in the Fields pane of the main Power BI window:
    image
  2. Next, reopen the Power Query Editor window, go to the Tools tab and click on the Start Diagnostics button.
    image
  3. Do not do anything else in the Power Query Editor window, instead minimise (do not close) it and then go back to the main Power BI window. There, right-click on the table that holds the output of the query in the Fields pane, right-click and select “Refresh data”:
    image
  4. Once the refresh has taken place, go back to the Power Query Editor window and click the Stop Diagnostics button:
    image
  5. When you have done that, you’ll see two new queries created in a group called Diagnostics:
    image
    These queries contain the diagnostics information for the refresh you’ve just done. The query with “Detailed” in its name has all the detailed information; the other query contains summarised data.
  6. Looking at the “Detailed” query, each row represents a single operation that takes place within the Power Query engine when the refresh took place and while I don’t know what all this means (I’m working on it!), if you go to the Data Source Query column you’ll see a list of all the HTTP requests made to the web service:
    image
  7. Clicking on an individual cell in this column allows you to see the full text of the request and response made in the preview pane underneath the results pane:
    image

While Fiddler may still be necessary for some more advanced scenarios, this is a much easier way of troubleshooting web service data sources in the Power Query Editor, for example when you need to check the exact request made to see if query folding is taking place on an OData data source.

Excel Dynamic Arrays And The CubeSet Function

Among all the exciting announcements made at Ignite last week, one you may have missed (even in all the Excel-related announcements here) was that dynamic arrays have finally reached GA. Ever since they were announced I’ve been interested in how they can be used with Excel cube functions, which allow you to get data from Analysis Services, Power BI and the Excel Data Model/Power Pivot into cells on the worksheet, and I’ve blogged about this once already. Even though right now the CubeValue function can’t be used with dynamic arrays – alas – there are still some interesting ways dynamic arrays and cube functions can be used together and in this post I’ll show you another one I’ve found.

Say you have the following set of tables in the Excel Data Model:

image

And the following measure that counts the number of customers who bought something:

Distinct Customers:=DISTINCTCOUNT('Internet Sales'[CustomerKey])

Let’s also say you have on a worksheet the following table of data showing the number of distinct customers broken down by product, created using the CubeMember and CubeValue functions:

image

Here are the formulas for this table:

image

So far, all very straightforward. Now let’s say you want to use a dynamic array to filter the rows in this table so you only see the products that have more than a certain number of distinct customers, and you want to see a grand total for this filtered list of customers, like so:

ExcelDynamicArray

Here are the formulas to achieve this:

image

To explain what’s going on here:

  • Cell C2 contains the text “ThisWorkbookDataModel”, the name of the connection to the Excel Data Model. Doing this makes the other Excel formulas here much easier to read!
  • Cell F2 – which I’ve turned into a named range called FilterThreshold – contains the number of customers which a product must exceed to be shown.
  • Cell E5 contains the dynamic array Filter function that filters the rows shown in the first two screenshots above and only returns the products that exceed the threshold. The formula is:
    FILTER(B5:B134,C5:C134>FilterThreshold)
  • Cell E6 does something similar to E5 but returns the number of distinct customers for each product. If all I wanted was the products and the number of distinct customers I wouldn’t have needed two separate formulas, I could have used one, but I need to add the grand total too…
  • Cell E4 contains the formula:
    CUBESET($C$2, E5#, “Total”)
    This creates a set from the spill reference E5#, which contains the cells containing CubeMember formulas returned by the Filter function. For some reason CubeSet returns an error if you try to use a dynamic array formula in its second parameter but using a spill reference works ok; this is why I had to use separate formulas in E5 and E6.
  • Finally, cell F4 contains the formula that returns the total distinct count for all products returned in the filtered list using CubeValue and the set created in cell E4:
    CUBEVALUE($C$2, “[Measures].[Distinct Customers]”, E4)
    Since this is a distinct count the only way to get the correct value here is using CubeSet, CubeValue and the DAX measure – there would be no way to calculate the correct value using Excel formulas. The same could be said of almost any other DAX measure that wasn’t a simple aggregation.

You can download the sample workbook for this post here.

Power BI Report Builder And RSCustomDaxFilter

If you’re building DAX queries using Power BI Report Builder you might notice something that looks like a new DAX function called RSCustomDaxFilter. For example, here’s a simple DAX query built from the Adventure Works Tabular demo database, with one measure and one parameter built Calendar Year:

image

Note that the “Enable Multi Value Parameters” option has been selected. Here’s what you’ll see if you view the text of the DAX query in the Query Designer:

EVALUATE 
SUMMARIZECOLUMNS(
RSCustomDaxFilter(
 @DateCalendarYear,
 EqualToCondition,
 [Date].[Calendar Year],
 Int64
 ), 
"Internet Total Sales", 
[Internet Total Sales]
)

But what is RSCustomDaxFilter? If you run your report and see what happens on the server using SQL Server Profiler, here’s the query that actually gets run (in this case I selected the years 2013 and 2014 for the parameter):

EVALUATE
SUMMARIZECOLUMNS (
    FILTER (
        VALUES ( 'Date'[Calendar Year] ),
        ( 'Date'[Calendar Year] = VALUE ( "2013" ) )
            || ( 'Date'[Calendar Year] = VALUE ( "2014" ) )
    ),
    "Internet Total Sales", [Internet Total Sales]
)

What has happened is that RSCustomDaxFilter has been replaced with an expression using the DAX Filter() function that implements the filter on the selected years; it’s just a placeholder for a dynamically-generated DAX expression that is substituted in at runtime.

Why is it needed? Handling multi-value parameters is difficult in DAX when you don’t know how many values are going to be passed to the parameters (it’s a subject I’ve blogged about here and here) and some kind of dynamic code generation is a reasonable solution to this problem.

There is one drawback with this approach though – it can generate a DAX query that is too long to be executed. Here’s the error message you’ll see:

image

The specified query is too complex to be evaluated as a single statement.

I’m not sure what the maximum length of a query is in DAX – I suspect it’s 32768 characters. You’ll hit this limit if you create a parameter on a field with thousands of values in and then select all but a few of these values; from the example above you can imagine how long the resulting DAX query could be:

image

There’s no single workaround for this but some custom DAX (such as the example here) will be necessary; you’ll just need to ensure that the DAX query is as short as possible.

%d bloggers like this: