Back in March, with the publication of the Dynamics 365 Release Notes document (which you can download here, halfway down the page), we learned a lot about the roadmap of Power BI; my blog post here summarises the announcements. However, on Twitter Matthew Roche reminded me that it is a living document and indeed it turns out that it has been updated several times since March. So what has been added since then about the future of Power BI?
The change history section at the start helps identify what’s new in the document, but it’s not easy to tell what genuinely new Power BI features have been announced. There are plenty of changes to availability dates for sure. Here’s what little I’ve found in terms of new, interesting stuff (page numbers are for document version 18.1.2):
Filtering in the data view (p205) – finally! This is a really useful feature from Excel’s Power Pivot window that allows you to sort and filter the data that is shown in the data view after you have loaded data into tables in your dataset:
Various improvements to SAP BW and SAP Hana data connectors (p241) – I don’t think some of the details here have been officially announced yet, but I’m not an SAP person so I could be wrong.
Power BI custom connectors written in M will also work in Flow, PowerApps and Logic Apps (p261). This is something that Matt Masson talked about in his session at the Dublin Power BI conference too, but which I don’t think has been mentioned anywhere else. This makes custom connectors even more powerful!
Let me know if you find anything else! I’ll need to remember to check for changes to this document on a regular basis…
About two years ago I wrote a blog post describing how the #table M function can be used to generate tables, but in that post I only covered the functionality I used regularly – namely using #table with a list of column names or a table type in the first parameter. However there two other variations on #table that I have used recently that I thought were worth pointing out.
For example, if you need to generate a table with a set number of columns but you don’t care what the columns are called, you can use an integer in the first parameter to get a table with that number of columns. The following expression returns a table with four columns of data type Any called Column1, Column2, Column3 and Column3, and no rows:
Also, if you have a list of lists with an unknown number of items in and you want to use each nested list for the row values in a table, you can use a null value in the first parameter of #table. The following expression returns a table with four columns like the one above, but with two rows of integer values:
Back in 2012 I wrote a blog post explaining how to handle multi-value parameters in DAX queries in Reporting Services reports. The approach I took back then was basically:
Generate a string containing a pipe-delimited list of all the parameter values that the user has selected (I did this in DAX too, but to be honest it’s better to use the SSRS Join function to do this outside the DAX query)
Use the DAX PathContains() function in a Filter() to check whether the value on the current row of the table being filtered appears in the pipe-delimited list
Here’s a deliberately simplified example of how this works based on Adventure Works DW data. The following query filters the FactInternetSales fact table and returns the rows for the Sales Order Numbers that are present in the OrderList variable:
The trouble with this approach is that is that it can be very slow. Running a trace in DAX Studio for the query above reveals the problem:
The presence of CallbackDataID shows that the Storage Engine is calling the Formula Engine to handle the use of PathContains() in the filter, and this is often a cause of poor query performance. However back when I wrote the post the only alternative was, as Chris Koester points out here, to dynamically generate the entire DAX query as an SSRS expression and that is very painful to do.
The good news is that recent changes in DAX mean that there is another way to tackle this problem that can give much better performance. Here’s an example of this new approach:
VAR OrderList = "SO43713|SO43758|SO43784|SO43821"
VAR OrderCount =
PATHLENGTH ( OrderList )
VAR NumberTable =
GENERATESERIES ( 1, OrderCount, 1 )
VAR OrderTable =
VAR CurrentKey = [Value]
ROW ( "Key", PATHITEM ( OrderList, CurrentKey ) )
VAR GetKeyColumn =
SELECTCOLUMNS ( OrderTable, "Key", [Key] )
VAR FilterTable =
TREATAS ( GetKeyColumn, FactInternetSales[SalesOrderNumber] )
CALCULATETABLE ( FactInternetSales, FilterTable )
Broken down variable by variable, here’s how it works:
OrderList is the pipe-delimited list of key values passed from SSRS
OrderCount uses the PathLength() DAX function to find the number of parameter values in this list
NumberTable uses the GenerateSeries() function to create a table of numbers with one row for each number between 1 and the number of parameter values in the list
OrderTable uses the trick Marco describes here to iterate over NumberTable and, for each row, uses the PathItem() function to return one parameter value from the list for each row in the able
GetKeyColumn uses the SelectColumns() DAX function to only return the column from OrderTable that contains the parameter values
FilterTable uses the TreatAs() DAX function to take the table of values returned by GetKeyColumn and treat them as values in the FactInternetSales[SalesOrderNumber] column
Finally, the query returns the contents of the FactInternetSales table filtered by the values in FilterTable using the CalculateTable() DAX function
There’s a lot of extra code here and in some cases you may find that performance with smaller data volumes is worse as a result, but in this particular case the new approach is twice as fast at the old one. There’s certainly no CallBackDataID:
Although the Power Query engine generates MDX queries when importing from Analysis Services in the same way it generates SQL queries when it imports from a relational database, the View Native Query option doesn’t work for Analysis Services data sources. You can of course use a Profiler trace or xEvents to see the MDX, but for most users that will not be an option for security reasons. The UI does tell you in another way though. If query folding is taking place for a step, then the Cube Tools menu on the ribbon (with the Add Items and Collapse Columns buttons) will be available, and in the top right-hand corner of the table in the results area there will be a cube icon:
If query folding is not taking place for a step (even though it might be taking place for previous steps in the query) then the Cube Tools menu will not be visible, and the cube icon will be replaced by a table icon. For example, in the following screenshot an Index column has been added to the query shown above, so query folding is not taking place from this step on:
A lot of people have problems with the performance of OData data sources when loading data into Power BI and Excel. One possible cause of these problems is query folding not taking place – if this is the case then the Power Query engine will be requesting more data that is needed and applying any filters itself locally, rather than requesting filtered data from the data source. How do you know whether query folding is taking place or not though? The Power Query Editor UI doesn’t give you this information, unfortunately. Last week, at the Data and BI Summit in Dublin, Matt Masson demonstrated how to use Fiddler to check if query folding is taking place when loading data from an OData data source and he’s very kindly allowed me to write about what he showed here.
The following examples use the UK Parliament’s public OData API which is documented here. The M query below, generated using the Power Query Editor in Power BI Desktop, returns all the rows from the Government Organisation table from the API:
Source = OData.Feed("https://api.parliament.uk/odata"),
To monitor communication between Power BI Desktop and the OData API you will need to install Fiddler, a free tool from Telerik. You can download it here:
You’ll probably also need to configure Fiddler to intercept https traffic, which you can find out how to do here. It’s a very powerful tool and I am by no means an expert in using it, but even a basic understanding of its features can be very useful for a Power BI developer.
With Fiddler running you can add a filter so that it only shows traffic to certain hosts; in this case I’m filtering to only show traffic to api.parliament.uk:
Refreshing the query shown above in the Query Editor results in the following activity being shown in Fiddler:
There are several calls to the service root URL, but the important call is to get the top 1000 rows from the GovernmentOrganisation table:
Notice the use of the $top query option to restrict the number of rows returned – the first example of the Power Query engine pushing a filter back to the data source. This only happens when you refresh the query in the Power Query Editor so you can see a sample of the data; when you click the Close and Apply button and load the query into your dataset you’ll see that this $top filter is not applied and all the rows from the table are requested.
Altering the query to filter the rows down to where the GroupName column equals “Cabinet Office” like so:
Results in the following M query:
Source = OData.Feed("https://api.parliament.uk/odata"),
#"Filtered Rows" =
each ([GroupName] = "Cabinet Office")
Fiddler shows the following call that includes the filter:
[The Power Query Editor does quite a lot of caching so you may need to click the Refresh button on the ribbon to make sure it actually does call the API]
It can be quite difficult to work out what’s going on in a call like this, so with the highlighted row in the screenshot above selected you can go to the Inspectors tab and then the WebForms sub-tab, and it will show the different query options used:
In this case you can see the $filter query option has been used to do the filter, so query folding has taken place in this case and the API is only returning the one row that the query returns:
However it is all too easy to do something in your query that prevents query folding from happening. For example if you add an index column to the table before filtering by GroupName, as shown in this query:
…then Fiddler shows that only the $top filter is being applied in the call to the API, not the $filter on GroupName, so query folding is no longer taking place for the filter step:
Not everything you do in your M queries can or will be folded back to an OData API, but in general you should aim to get query folding to take place on the steps in your query that reduce the amount of data returned by the API the most. Row filters like the one shown above are a prime example of the type of transformation that will need to be folded in order to get the best possible data refresh performance. You may need to experiment with reordering steps and applying transformations in different ways to get the Power Query engine to call the API in the way you want.