More Details On Creating Tables In Power BI/Power Query M Code Using #table()

About two years ago I wrote a blog post describing how the #table M function can be used to generate tables, but in that post I only covered the functionality I used regularly – namely using #table with a list of column names or a table type in the first parameter. However there two other variations on #table that I have used recently that I thought were worth pointing out.

For example, if you need to generate a table with a set number of columns but you don’t care what the columns are called, you can use an integer in the first parameter to get a table with that number of columns. The following expression returns a table with four columns of data type Any called Column1, Column2, Column3 and Column3, and no rows:

#table(4,{})

image

Also, if you have a list of lists with an unknown number of items in and you want to use each nested list for the row values in a table, you can use a null value in the first parameter of #table. The following expression returns a table with four columns like the one above, but with two rows of integer values:

#table(null, {{1,2,3,4},{2,3,4,5}})

image

How To Tell Whether Query Folding Is Taking Place When Importing Data From Analysis Services In Power BI And Excel Power Query

As a quick follow-on from last week’s post on how to detect whether query folding is taking place when importing from OData data sources, if you’re importing data from Analysis Services you have a similar problem: how do you know whether query folding is taking place? Ensuring that query folding takes place for as many of the steps in your query – especially those that filter or otherwise reduce the amount of data returned – is very important for data refresh performance.

Although the Power Query engine generates MDX queries when importing from Analysis Services in the same way it generates SQL queries when it imports from a relational database, the View Native Query option doesn’t work for Analysis Services data sources. You can of course use a Profiler trace or xEvents to see the MDX, but for most users that will not be an option for security reasons. The UI does tell you in another way though. If query folding is taking place for a step, then the Cube Tools menu on the ribbon (with the Add Items and Collapse Columns buttons) will be available, and in the top right-hand corner of the table in the results area there will be a cube icon:

image

If query folding is not taking place for a step (even though it might be taking place for previous steps in the query) then the Cube Tools menu will not be visible, and the cube icon will be replaced by a table icon. For example, in the following screenshot an Index column has been added to the query shown above, so query folding is not taking place from this step on:

image

[Thanks to Jure Jaklic for pointing this out]

Troubleshooting Data Refresh Performance Issues With OData Data Sources In Power BI And Excel Using Fiddler

A lot of people have problems with the performance of OData data sources when loading data into Power BI and Excel. One possible cause of these problems is query folding not taking place – if this is the case then the Power Query engine will be requesting more data that is needed and applying any filters itself locally, rather than requesting filtered data from the data source. How do you know whether query folding is taking place or not though? The Power Query Editor UI doesn’t give you this information, unfortunately. Last week, at the Data and BI Summit in Dublin, Matt Masson demonstrated how to use Fiddler to check if query folding is taking place when loading data from an OData data source and he’s very kindly allowed me to write about what he showed here.

The following examples use the UK Parliament’s public OData API which is documented here. The M query below, generated using the Power Query Editor in Power BI Desktop, returns all the rows from the Government Organisation table from the API:

let
    Source = OData.Feed("https://api.parliament.uk/odata"),
    GovernmentOrganisation_table =
	Source{[
		Name="GovernmentOrganisation",
		Signature="table"
		]}[Data]
in
    GovernmentOrganisation_table

image

To monitor communication between Power BI Desktop and the OData API you will need to install Fiddler, a free tool from Telerik. You can download it here:

https://www.telerik.com/fiddler

You’ll probably also need to configure Fiddler to intercept https traffic, which you can find out how to do here. It’s a very powerful tool and I am by no means an expert in using it, but even a basic understanding of its features can be very useful for a Power BI developer.

With Fiddler running you can add a filter so that it only shows traffic to certain hosts; in this case I’m filtering to only show traffic to api.parliament.uk:

image

Refreshing the query shown above in the Query Editor results in the following activity being shown in Fiddler:

image

There are several calls to the service root URL, but the important call is to get the top 1000 rows from the GovernmentOrganisation table:

https://api.parliament.uk/odata/GovernmentOrganisation?$top=1000

Notice the use of the $top query option to restrict the number of rows returned – the first example of the Power Query engine pushing a filter back to the data source. This only happens when you refresh the query in the Power Query Editor so you can see a sample of the data; when you click the Close and Apply button and load the query into your dataset you’ll see that this $top filter is not applied and all the rows from the table are requested.

Altering the query to filter the rows down to where the GroupName column equals “Cabinet Office” like so:

image

Results in the following M query:

let
    Source = OData.Feed("https://api.parliament.uk/odata"),
    GovernmentOrganisation_table =
	Source{[
		Name="GovernmentOrganisation",
		Signature="table"
	]}[Data],
    #"Filtered Rows" =
	Table.SelectRows(
		GovernmentOrganisation_table,
		each ([GroupName] = "Cabinet Office")
	)
in
    #"Filtered Rows"

Fiddler shows the following call that includes the filter:

image

https://api.parliament.uk/odata/GovernmentOrganisation?$filter=GroupName%20eq%20’Cabinet%20Office’&$top=1000

[The Power Query Editor does quite a lot of caching so you may need to click the Refresh button on the ribbon to make sure it actually does call the API]

It can be quite difficult to work out what’s going on in a call like this, so with the highlighted row in the screenshot above selected you can go to the Inspectors tab and then the WebForms sub-tab, and it will show the different query options used:

image

In this case you can see the $filter query option has been used to do the filter, so query folding has taken place in this case and the API is only returning the one row that the query returns:

image

However it is all too easy to do something in your query that prevents query folding from happening. For example if you add an index column to the table before filtering by GroupName, as shown in this query:

let
    Source = OData.Feed("https://api.parliament.uk/odata"),
    GovernmentOrganisation_table =
	Source{[
		Name="GovernmentOrganisation",
		Signature="table"
	]}[Data],
    #"Added Index" =
	Table.AddIndexColumn(
		GovernmentOrganisation_table,
		"Index",
		0,
		1),
    #"Filtered Rows" =
		Table.SelectRows(
			#"Added Index",
			each ([GroupName] = "Cabinet Office"))
in
    #"Filtered Rows"

 

image

…then Fiddler shows that only the $top filter is being applied in the call to the API, not the $filter on GroupName, so query folding is no longer taking place for the filter step:

image

image

Not everything you do in your M queries can or will be folded back to an OData API, but in general you should aim to get query folding to take place on the steps in your query that reduce the amount of data returned by the API the most. Row filters like the one shown above are a prime example of the type of transformation that will need to be folded in order to get the best possible data refresh performance. You may need to experiment with reordering steps and applying transformations in different ways to get the Power Query engine to call the API in the way you want.

Reordering Multiple Columns With ‘Remove Other Columns’ In The Power Query Editor

In the comments to my last blog post on how the order that you select columns can affect the output of certain calculations, both Bradley Sawler and John B. Thomas pointed out something very useful that I didn’t know about: that the order you select columns can also be used with the ‘Remove Other Columns’ functionality to reorder columns in bulk.

For example, imagine you have a table with columns called A, B, C, D, E and F. If you select the columns in the order F, E, D, A, B and C and the select ‘Remove Other Columns’, the columns are reordered in the order that you clicked them:

PQReorderCols

As you can see from the demo above, ‘Remove Other Columns’ uses the Table.SelectColumns M function behind the scenes and the order the columns are listed in that function is the order that you have clicked them in. A great trick for reordering a large number of columns quickly!

The Order You Select Columns In The Power Query Editor Can Affect The Output Of Some Transformations

Maybe this is obvious to more experienced Power Query users, but something I always point out when I’m training people up on Power Query is that the order that you select columns in the Power Query Editor window (both in Power BI Desktop and Excel) can affect the output of certain transformations. For example, say you have a table with two columns A and B that both contain numbers; if you select A first and then B, and then go to Add Column/Standard/Divide, you’ll get a new column that contains the value of the calculation A/B. However, if you select B first and then A and do the division you’ll get B/A:

PQDivide

The order that you select columns is also significant for some other types of calculation such as Percent Of and Power, and also when you do a Merge Columns.

Make Excel Reports Created With Analyze In Excel Work After Publishing To Power BI!

I think Power BI’s a great tool, but like most Power BI users I have a list of my own pet features that I would like to see implemented to make Power BI even greater. What I would most like to see addressed is the fact that, right now, Analyze In Excel only works with Excel on the desktop but not after you have published a workbook to Power BI. It seems crazy to me (and it’s very hard to explain to customers too) that Analyze In Excel lets you create reports in Excel using PivotTables and cube functions connected to a Power BI dataset, but when you publish your report to a Power BI workspace – so that the Excel workbook and the source data are both in Power BI – the reports stop working because Excel Online cannot connect back to Power BI.

There has been a post on the Power BI Ideas forum about this for some time, but recently I was talking to some guys from the Excel dev team and they told me that it’s actually something they, not the Power BI team, need to address. Therefore I created a post on the Excel UserVoice forum too:

https://excel.uservoice.com/forums/274580-excel-online/suggestions/33793252-pivottables-created-with-power-bi-using-analyze-in

Please vote for it! The more votes it gets, the more likely it is to be implemented quickly. Ken Puls managed to get a lot of votes for his idea to improve Power Query performance, and since that’s now in the process of being implemented it just goes to show that voting does influence what the Excel dev team works on.

Why is this important? In my opinion, Power BI is not a good ad-hoc data exploration tool and isn’t intended to be – its strengths lie elsewhere. However people do want to explore data stored in Power BI and an Excel PivotTable is the ideal way to do this (the Power BI matrix visual is very limited in comparison), and after you have found something interesting it’s only natural that you should want to share it. PivotTables aren’t the only thing Excel has to offer though: I’m a big fan of cube functions too, especially for creating financial reports, and Power BI has nothing remotely like them. Finally, don’t forget all those people who want to build reports in Excel because it’s Excel and that’s what they know. All in all getting this feature implemented would be a major boost for Power BI and broaden its range of capabilities.

Running M Queries In Visual Studio With The Power Query SDK

Writing M in the Advanced Editor in Excel or Power BI can be a frustrating experience unless you’re the kind of masochist who loves writing code in Notepad. There are some options for writing M code outside Excel and Power BI, for example Lars Schreiber’s M extension for Notepad++ (see here for details) or the M extension for Visual Studio Code (available from the Visual Studio Marketplace here; more details on Brett Powell’s blog here), but the trouble with them is that you have to copy the code back into Excel or Power BI to run it. What many people don’t realise, however, is that it is possible to write M code and have IntelliSense, formatting, keyword highlighting and also the ability to execute your own M queries, using the Power Query SDK in Visual Studio.

The Power Query SDK (which you can download here) supports Visual Studio 2015 and 2017 and is intended for people who are writing custom Data Connectors for Power BI. To let you test your Data Connector you can create a .pq file containing M code, and this in fact allows you to run any M query you want whether you’re building a Data Connector or not.

Here’s how. First, install the Power Query SDK and then open Visual Studio and create a new project. Find the Power Query template, select the PQ file option and give your file a name:

image

Then, in the .pq file that is created, you can enter an M query and then either press the Start button on the toolbar or hit F5 to run the query. The output of the query is shown on the Output tab in the M Query Output window:

image

Right-clicking on your project in the Solution Explorer pane and selecting Properties brings up a Properties dialog with various properties that control how your queries behave:

image

image

Many of the properties are self-explanatory, at least for anyone used to writing M in Power BI or Excel. FastCombine turns off data privacy checks. Allow Native Query lets you use M queries that contain ‘native’ queries (for example your own SQL queries if you’re using a SQL Server data source), as Cédric Charlier shows here. A few of them, such as Legacy Redirects, I have no idea about yet (I should really ask someone…). Turning on Show Engine Traces displays engine trace information in the Log tab of the M Query Output pane; turning on Show User Traces displays trace information generated by the use of the Diagnostics.Trace() function in your code in the Log tab. You can save the contents of the Log tab to a text file.

image

Error messages are displayed on the Errors tab of the M Query Output pane:

image

When you have a query that connects to an external data source, the first time you try to run your query you will be prompted to set the credentials used to connect to that data source (as you would in Power BI Desktop), and the data privacy level for the data source, on the Errors pane:

image

The query won’t actually run this first time though; you’ll need to hit Start/F5 again to see the results. If you close the project and then reopen it you will need to enter credentials again; alternatively, on the Credentials tab you can save the credentials used for a data source to a .crd file which can then be reloaded when you reopen your project. You can also edit and delete credentials on the Credentials tab.

image

If I’m honest it’s all very basic but it does the job. The main thing that I miss from writing M code in Power BI is the Query Editor UI – when I write M code there I only write about 50% of it manually, the rest I generate by clicking buttons in the UI because it’s faster. Give me the Query Editor (or the ribbonless version of it that comes with SSDT, because Visual Studio doesn’t support ribbons apparently) and I’ll be happy. Even better, give me the improved code editing functionality in the Advanced Editor in Power BI Desktop and Excel that we’ve been promised!

%d bloggers like this: