Pushing Data From Excel To Power BI Using Streaming Datasets

One Power BI feature that almost passed me by (because it was released in August while I was on holiday) was the ability to create streaming datasets in the Power BI web app and push data to them via the Power BI REST API. This blog post has the announcement:
https://powerbi.microsoft.com/en-us/blog/real-time-in-no-time-with-power-bi/
The documentation is here:
https://powerbi.microsoft.com/en-us/documentation/powerbi-service-real-time-streaming/
And Charles Sterling has an example of how to use it with Flow and PowerApps here:
https://blogs.msdn.microsoft.com/charles_sterling/2016/10/17/how-to-create-and-customize-applications-with-powerapps-with-laura-onu-webinar-1020-10am-pst/

However, when I played around with this I found a few things that were either confusing or not properly documented, so I thought it would be useful to give an example of how to use this functionality to automatically sync data from a table in Excel to Power BI using a Power Query query.

Creating the streaming dataset in Power BI

Imagine that you have a table called Sales in an Excel workbook on your desktop:

image

There are three columns: Month and Product, which contain text values, and Sales, which contains an integer value. This is the data that we want to push up to Power BI.

The first step is to create a streaming dataset in Power BI to receive this data. Go to PowerBI.com and on the left-hand menu, under the Datasets heading, you’ll find a link called Streaming Datasets right at the bottom next to the Get Data button:

image

Click it and you’ll go to the Streaming data screen. Click on the “Add streaming dataset” button in the top-right to create a new streaming dataset:

image

Choose the API option in the pop-out pane then click Next:

image

Then give your dataset a name, enter the names and data types for the columns in the table and leave the Historic data analysis option turned off (we’ll come back to this later):

image

Hit Create and you’ll see a screen showing the URL to use to push data to the dataset and an example of the JSON to use to send the data:

image

Copy the URL and put it somewhere handy – you’ll need it in a moment.

Pushing data to the streaming dataset from Excel

Back in your Excel workbook, open the Power Query Query Editor window and create a new text parameter called PowerBIStreamingDatasetURL and paste in the URL for the streaming dataset:

image

Next, create a new blank query and use the following M code:

[sourcecode language=”text” padlinenumbers=”true”]
let
    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
    ChangedType = Table.TransformColumnTypes(
        Source,
        {
            {"Month", type text},
            {"Product", type text},
            {"Sales", Int64.Type}
        }),
    ConvertToJson = Json.FromValue(ChangedType),
    SendToPowerBI = Web.Contents(PowerBIStreamingDatasetURL,
        [Content=ConvertToJson,
        ManualStatusHandling={400,404}]),
    GetMetadata = Value.Metadata(SendToPowerBI),
    GetResponseCode = GetMetadata[Response.Status],
    CurrentTime = DateTime.ToText(DateTime.FixedLocalNow()),
    Output = #table({"Status"},
        {{
            if GetResponseCode=200 then
                "Data updated successfully at " & CurrentTime
            else
                "Failure at " & CurrentTime}})
in
    Output
[/sourcecode]

This query does the following:

  • Reads the data from the Sales table in the workbook
  • Converts the data to JSON (for some background on how it does this, see here)
  • Sends the data to the streaming dataset using Web.Contents() to make a POST request. See this post on how to make POST requests using Web.Contents() and this post on the technique I’m using to handle HTTP errors manually.
  • Returns a table containing a message saying whether the data was updated successfully or not, and the time of execution, like so:

image
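
If you want to check what is being sent before pointing the query at a real dataset, a minimal sketch like the following can help. It is not part of the original workbook: the sample rows are made up for illustration and stand in for the Sales table, and the binary value returned by Json.FromValue() is decoded to text instead of being passed to Web.Contents().

[sourcecode language="text"]
let
    // Hypothetical sample rows standing in for the Sales table in the workbook
    SampleSales = #table(
        type table [Month = text, Product = text, Sales = Int64.Type],
        {
            {"January", "Apples", 10},
            {"January", "Oranges", 20}
        }),
    // Json.FromValue returns a binary value; decode it to text so it can be inspected
    PayloadAsText = Text.FromBinary(Json.FromValue(SampleSales), TextEncoding.Utf8)
in
    PayloadAsText
[/sourcecode]

This should return a JSON array containing one object per row, with one property per column.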

Finally, back in Excel, go to the Data tab on the ribbon, click on the Connections button to open the Workbook Connections dialog, select the connection that represents the query you’ve just created, click Properties, then in the Connection Properties dialog tick the “Refresh every” box and set the query to refresh automatically every minute:

image

Displaying data in a Power BI dashboard

Back in the browser in Power BI, create a new dashboard, click the Add Tile button and choose the Custom Streaming Data option:

image

Click Next and select the streaming dataset created earlier:

image

Click Next again and then choose Clustered bar chart as your Visualization Type, select the Month field of the dataset for the Axis, Product for the Legend…

image

…the Sales field for the Value and set the time window to display to 1 second:

image

Frustratingly there’s no way to create a measure or otherwise aggregate data here. In this example you’re using all of the fields in the dataset in the chart; if you left out Product, however, you wouldn’t see aggregated sales for all products, you would just see data for one (the last?) row in the table for each month.

Finally, set a title for the chart:

image

You now have a dashboard that gets updated automatically and shows the data from the Sales table in the Excel workbook:

image

When you change the data in Excel, after the Power Query query has run in the background every minute, the new data will appear in the chart.

[Be aware that it might take a few minutes for everything to start working when you first create a new tile]

Other ways of visualising the data

There are other types of data visualisation you can use, such as line charts, that are all very straightforward. One thing that did confuse me was the card visual: it shows one number, but which number? In this example if you create a card and link it to the Sales field in the dataset, it will always display the value from the last row in the table:

image

Again, it would be really nice if there was a way of creating a measure here…

The Historic Data Analysis option

You may remember the Historic Data Analysis option from an earlier step. What happens if you turn it on? Basically, instead of storing just one copy of the table you push through the API it stores multiple copies of the table (although it doesn’t store everything – I guess it’s subject to this retention policy or something similar). For example, consider the following variation on the streaming dataset above:

image

There’s a new field called UpdateDateTime (which is of type text, not datetime, because I found this worked better in reports) and the Historic data analysis switch is turned on.

Here’s an updated version of the Power Query query that populates the UpdateDateTime field with the date and time that the query was run:

[sourcecode language=”text”]
let
    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
    CurrentTime = DateTime.ToText(DateTime.FixedLocalNow()),
    AddUpdateDateTime = Table.AddColumn(Source, "UpdateDateTime",
        each "Data Update: " & CurrentTime),
    ChangedType = Table.TransformColumnTypes(
        AddUpdateDateTime,
        {
            {"Month", type text},
            {"Product", type text},
            {"Sales", Int64.Type},
            {"UpdateDateTime", type text}
        }),
    ConvertToJson = Json.FromValue(ChangedType),
    SendToPowerBI = Web.Contents(PowerBIStreamingDatasetURL,
        [Content=ConvertToJson,
        ManualStatusHandling={400,404}]),
    GetMetadata = Value.Metadata(SendToPowerBI),
    GetResponseCode = GetMetadata[Response.Status],
    Output = #table({"Status"},
        {{
            if GetResponseCode=200 then
                "Data updated successfully at " & CurrentTime
            else
                "Failure at " & CurrentTime}})
in
    Output
[/sourcecode]

You can download a demo workbook with this second example query in here.

The dashboards now work in more or less the same way. The Time Window To Display option that we set to 1 Second above can be used to control the number of copies of the pushed table that are displayed. For example, setting it to five minutes shows data from all of the copies of the table pushed in the last five minutes:

image

[Incidentally, if you’re using the Power Query query above you’ll see that every time the query runs, the web service is actually called twice! This is a feature of Power Query and M in general – there’s no guarantee that the web service will be called just once even if the query itself is executed once. This is why the dev team always tells people never to use Power Query to update data in a data source (see here for another example of this)]

You now also get a new option to create a report from a streaming dataset on the Streaming Data screen – you need to click the small graph icon next to the name of the streaming dataset:

image

image

So now you can create reports that show how the data in your Excel table has changed over time, and slice by the values in the UpdateDateTime field:

image

It’s important to realise that unlike the dashboards, reports connected to a streaming dataset don’t refresh automatically – you have to click the Refresh button on the report.

Conclusion

Although the examples in this post are perhaps not all that practical, hopefully they show what’s possible with the streaming API and some M code. It would be great if we could do data modelling-type stuff like add measures in the web interface, in the same way that we can in Power BI Desktop, because that would open the door to doing even more cool things with streaming data.

Generating JSON In Power BI And Power Query

Often, when calling web services from Power BI or Power Query, you’ll need to generate some JSON inside your query to send to these web services. The M language makes this relatively easy to do with the Json.FromValue() function, but there’s only one example of how to use it in the help, so I thought it might be useful to provide some worked examples of M data types and how Json.FromValue() turns them into JSON.

First, here’s a function – that I’ve called GetJson() for the examples here – that takes a parameter of any data type, passes it to Json.FromValue() and returns the JSON representation of the input as text:

(InputData) =>
let
    JsonOutput = Json.FromValue(InputData),
    OutputText = Text.FromBinary(JsonOutput)
in
    OutputText

Now, let’s see what this returns for a variety of different data types.

Tables

Take the following table on an Excel worksheet, called “Sales”:

image

The following query loads the data from this table and calls the GetJson() function defined above:

let
    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(
        Source,
        {{"Month", type text}, {"Sales", Int64.Type}}),
    Output = GetJson(#"Changed Type")
in
    Output

It returns the following JSON, an array of objects:

[
    {"Month":"January","Sales":1},
    {"Month":"February","Sales":2},
    {"Month":"March","Sales":3}
]

Lists

M language lists are represented as JSON arrays, so the following query:

GetJson({"Apples", "Oranges", "Pears"})

…returns

["Apples","Oranges","Pears"]

And

GetJson({{1,"Apples"}, {2,"Oranges"}, {3,"Pears"}})

…returns

[[1,"Apples"],[2,"Oranges"],[3,"Pears"]]

Records

M language records are represented as JSON objects, so:

GetJson([month="January", product="Apples", sales=1])

…returns

{"month":"January","product":"Apples","sales":1}

Nested Data

Finally, in M it’s possible to have nested data structures: lists of tables, records containing lists and so on. These can be converted to JSON too, so in the following example of a record containing a text value, a list and a table (created using #table):

GetJson(
    [
        product="Apples",
        colours={"Red","Green"},
        sales=
            #table({"Month", "Sales"},
                {
                    {"January", 1},
                    {"February", 2}
                }
            )
    ]
)

…the JSON output is:

{
    "product":"Apples",
    "colours":[
        "Red",
        "Green"
    ],
    "sales":[
        {
            "Month":"January",
            "Sales":1
        },
        {
            "Month":"February",
            "Sales":2
        }
    ]
}
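
Scalar values inside these structures are converted too. As a small sketch, assuming the behaviour described in the Json.FromValue() documentation (numbers, logical values and text map to the corresponding JSON types, while dates are written as ISO 8601 text), the following expression mixes several types in one record. The step names are my own.

[sourcecode language="text"]
let
    // A record containing a list of mixed scalar values and a date
    InputData = [A = {1, true, "3"}, B = #date(2012, 3, 25)],
    // Decode the binary returned by Json.FromValue so the JSON can be read as text
    OutputText = Text.FromBinary(Json.FromValue(InputData))
in
    OutputText
[/sourcecode]

…which should return:

{"A":[1,true,"3"],"B":"2012-03-25"}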

You can download the sample Excel 2016 workbook with these examples in here.

Using The Invoke Custom Function Button In Power BI

There are a lot of cool new features in the September 2016 update for Power BI, so many in fact that several of the really important changes in the Query Editor are in danger of going unnoticed. In this post I want to walk through how to use the new Invoke Custom Function button in the Query Editor in Power BI and explain why it’s such a useful thing to have.

More advanced Power BI users will know that a lot of data loading patterns involve using custom M functions. In the past implementing these patterns involved learning M both for writing functions and also for invoking those functions. A few months ago Power BI introduced the ability to automatically generate functions from queries that use parameters, without needing to write code, and now with the latest update we can also invoke functions easily by clicking a button. This means that a lot more advanced data loading patterns are now available to users who don’t know any M and there’s even less need for someone like me to open the Advanced Editor window and start writing code.

Let’s take a look at how this works with a very simple example. Say you have a table that contains sales data, with the number of units sold and the price paid:

image

You now want to add a new column to this table that calculates the sales value as Units * Price, and you have a function to do this. Here’s what the M code for that function (called “Calculate Value”) could look like:

[sourcecode language=”text” padlinenumbers=”true”]
(Units as number, Price as number) => Units * Price
[/sourcecode]

image

With a query that returns your sales data and another query that returns the Calculate Value function, you can easily create a new column on the sales data query and invoke the function for each row. Go to the Sales query, go to the Add Column tab on the ribbon, and click Invoke Custom Function:

image

You’ll see the Invoke Custom Function dialog appear. Here you can choose the query that returns the function you want to use and enter the values you want to pass to that function’s parameters. At the moment you can type in a value or pass values from a column in the table you’re invoking the function on (strangely enough, you don’t seem to be able to use Power BI parameters here yet):

image

Click OK and the function is invoked for every row in the table:

image
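
Behind the scenes the button adds a step that calls the function once per row using Table.AddColumn(). The following is a rough, self-contained sketch of that kind of step rather than the exact code the UI generates: the sample rows, the inline CalculateValue name and the "Value" column name are all assumptions made for illustration.

[sourcecode language="text"]
let
    // Hypothetical sales data standing in for the Sales query
    Sales = #table(
        type table [Units = number, Price = number],
        {{2, 5}, {3, 10}}),
    // The function from this post, defined inline so the sketch is self-contained
    CalculateValue = (Units as number, Price as number) => Units * Price,
    // Add a column by invoking the function for each row,
    // passing in the values from the Units and Price columns
    InvokedFunction = Table.AddColumn(
        Sales,
        "Value",
        each CalculateValue([Units], [Price]))
in
    InvokedFunction
[/sourcecode]

The result is the original table plus a Value column containing 10 and 30 for the two sample rows.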

To take a more realistic example, look at this post – this new functionality replaces the step where I create a new Custom Column and invoke the GetSheet1() function that I created.

This is why Power BI is so successful: Microsoft are not only implementing high-profile wow features but also adding the obscure, unsexy features that nonetheless make a real difference to the productivity of advanced users.

Data-Driven Power BI Desktop Parameters Using List Queries

The July 2016 update for Power BI Desktop included the ability to make parameters data-driven, by giving the option to bind the Suggested Values (previously called the Allowed Values) property of a parameter to the output of a query that returns a list. However, that’s pretty much all the information the blog post gives you, so if you’re wondering what a list is and how to get a query to return one so you can use it in a parameter then read on…

A list is one of the most useful data types in M, the language behind all of Power BI Desktop’s data-loading functionality. A list is nothing more than an ordered list of values of any data type and it’s written in M as a comma-delimited list enclosed in braces. Any query can return a list – most queries you build in the Query Editor window in Power BI Desktop will return tables but a query can in fact return a value of any data type.

Here’s an example of how to define a list containing three values in M:

[sourcecode language=”text” padlinenumbers=”true”]
{"Apples","Oranges","Pears"}
[/sourcecode]

You can create a query that returns this list in Power BI Desktop by clicking on the Get Data button and selecting the New Blank Query option:

image

…and then, in the Query Editor window, opening the Advanced Editor, deleting all the code in there and replacing it with an expression like the one above that returns a list:

image

When you click Done the Advanced Editor dialog will close and you’ll see the values in the list displayed in the Query Editor, along with the List Tools tab on the ribbon above it:

image

Although this looks table-like, it isn’t a table – don’t get confused between tables and lists!

Now you have a query that returns a list you can create a new parameter that uses it. In the Parameters dialog, create a new parameter then choose Query in the Suggested Values dropdown box and then choose the name of the query that returns the list (in this case I’ve called the query Fruit) in the Query dropdown box:

image

You now have a parameter whose suggested values are provided by the output of a query:

image

OK, so now you know what a query that returns a list looks like and how to use it in a parameter. The example above isn’t very practical though – how do you get a list of values from a real-world query? It’s actually very easy.

Imagine you have a query returning the contents of the DimDate table from the Adventure Works DW SQL Server sample database and you want to create a parameter that allows your user to select a day of the week. The EnglishDayNameOfWeek column in the DimDate table contains the day names that you need.

To get a list from a column in a table you need to click on the column to select it and then right-click and select either:

  • Drill Down
  • Add as New Query

image

Drill Down creates a new step in the current query that returns a list of all of the values in the selected column; Add as New Query gives you the same list but as a new query.

image

The last thing to do is to click the Remove Duplicates button so that the list only contains the seven distinct values:

image

Now you have a list whose values are derived from an external data source, ready to create a data-driven parameter.
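
In M terms, these steps come down to selecting a column by name (which returns a list) and then removing duplicates from that list. Here is a minimal sketch, using a small hard-coded table in place of the real DimDate table; the sample rows are made up and the step names are my own, not necessarily the ones the UI generates.

[sourcecode language="text"]
let
    // Hypothetical stand-in for the DimDate table from Adventure Works DW
    DimDate = #table(
        type table [DateKey = Int64.Type, EnglishDayNameOfWeek = text],
        {
            {20050101, "Saturday"},
            {20050102, "Sunday"},
            {20050108, "Saturday"}
        }),
    // Referencing a column by name returns its values as a list
    DayNames = DimDate[EnglishDayNameOfWeek],
    // Keep only the distinct values in the list
    DistinctDayNames = List.Distinct(DayNames)
in
    DistinctDayNames
[/sourcecode]

With the sample rows above this returns the list {"Saturday", "Sunday"}.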

One final thing to note: you may be wondering if it’s possible to create cascading parameters (like in Reporting Services) where the selection made in one parameter controls the available values for another parameter. As far as I can see this isn’t supported yet, unfortunately.

Web.Contents(), M Functions And Dataset Refresh Errors In Power BI

One slightly frustrating feature of Power BI is that some of the cool stuff you can do in M code for loading data, and which works in Power BI Desktop (and in Power Query), causes errors when you try to refresh your dataset after it has been published to PowerBI.com. I recently learned some interesting tricks for working around these problems when you are using Web.Contents() and M custom functions, which I thought deserved a wider audience and which are the subject of this post; thanks are due to Curt Hagenlocher of Microsoft and Idan Cohen for sharing this information.

First of all, I recommend you read my previous post on using the RelativePath and Query options with Web.Contents() to get some background, not just on the M functionality I’ll be using but also on the web service I’ll be using in my examples.

Let’s look at an example of where the problem occurs. The following M query uses a function to call the UK government’s open data metadata search API multiple times and then return a result into a table:

let
    Terms =
      #table(
       {"Term"},
       {{"apples"}, {"oranges"}, {"pears"}}
       ),
    SearchSuccessful = (Term) =>
    let
        Source =
         Json.Document(
           Web.Contents(
             "https://data.gov.uk/api/3/action/package_search?q="
             & Term
            )
           ),
        Success = Source[success]
    in
        Success,
    Output =
     Table.AddColumn(
       Terms,
       "Search Successful",
       each SearchSuccessful([Term])
      )
in
    Output

Here’s the output:

image

This is just a variation on the widely-used M pattern for using functions to iterate over and combine data from multiple data sources; Matt Masson has a good blog describing this pattern here. In this case I’m doing the following:

  • Defining a table using #table() with three rows containing three search terms.
  • Defining a function that calls the metadata API. It takes one parameter, a search term, and returns a value indicating whether the search was successful or not from the JSON document returned. What the API actually returns isn’t relevant here, though, just the fact that I’m calling it. Note how the URL passed to Web.Contents() is constructed by simply concatenating the base URL with the string passed in via the custom function’s Term parameter.
  • Adding a custom column to the table returned by the first step, and calling the function defined in the second step using the search term given in each row.

This query refreshes with no problems in Power BI Desktop. However, when you publish a report that uses this code to PowerBI.com and try to refresh the dataset, you’ll see that refresh fails and returns a rather unhelpful error message:

Data source error Unable to refresh the model (id=1264553) because it references an unsupported data source.

image
image

The problem is that when a published dataset is refreshed, Power BI does some static analysis on the code to determine what the data sources for the dataset are and whether the supplied credentials are correct. Unfortunately in some cases, such as when the definition of a data source depends on the parameters from a custom M function, that static analysis fails and therefore the dataset does not refresh.

The good news is that when, as in this case, the data source is a call to Web.Contents() then Power BI only checks the base url passed into the first parameter during static analysis – and as my previous blog post shows, by using the RelativePath and Query options with Web.Contents() you can leave the value passed to the first parameter as a static string. Therefore, the following version of the query does refresh successfully in Power BI:

let
    Terms =
      #table(
       {"Term"},
       {{"apples"}, {"oranges"}, {"pears"}}
       ),
    SearchSuccessful = (Term) =>
    let
        Source =
         Json.Document(
           Web.Contents(
             "https://data.gov.uk/api/3/action/package_search",
             [Query=[q=Term]]
            )
           ),
        Success = Source[success]
    in
        Success,
    Output =
     Table.AddColumn(
       Terms,
       "Search Successful",
       each SearchSuccessful([Term])
      )
in
    Output

This technique will only work if the url passed to the first parameter of Web.Contents() is valid in itself, is accessible and does not return an error. But what if it isn’t? Luckily there’s another trick you can play: when you specify the Query option it can override parts of the url supplied in the first parameter. For example, take the following expression:

Web.Contents(
 "https://data.gov.uk/api/3/action/package_search?q=apples",
 [Query=[q="oranges"]]
)

When static analysis is carried out before dataset refresh, the url

https://data.gov.uk/api/3/action/package_search?q=apples

…is evaluated. However, when the dataset is actually refreshed, the search term in the Query option overrides the search term in the base url, so that the call to the web service that is actually made and whose data is used by the query is:

https://data.gov.uk/api/3/action/package_search?q=oranges

This means you can specify a base url that isn’t really just a base url just so that static analysis succeeds, and then use the Query option to construct the url you really want to use.
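
Applied to the search function used earlier in this post, the trick might look something like the sketch below. It relies on the override behaviour described above, and the choice of "apples" as the placeholder search term in the first parameter is arbitrary.

[sourcecode language="text"]
let
    // Same function as in the queries above, but the first parameter is now a
    // complete url (a search for "apples") that static analysis can test
    SearchSuccessful = (Term as text) =>
        let
            Source =
                Json.Document(
                    Web.Contents(
                        "https://data.gov.uk/api/3/action/package_search?q=apples",
                        // At refresh time the q value supplied here replaces "apples"
                        [Query=[q=Term]]
                    )
                ),
            Success = Source[success]
        in
            Success
in
    SearchSuccessful("oranges")
[/sourcecode]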

Of course this is all a bit of a hack and I’m sure, eventually, we’ll get to the point where any M code that works in Power BI Desktop and/or Power Query works in a published report. However it doesn’t sound as though this will be happening in the near future so it’s good to know how to work around this problem. I wonder whether there are other, similar tricks you can play with functions that access data sources apart from Web.Contents()? I need to do some testing…

UPDATE: The “Skip Test Connection” option on Power BI data sources, added in April 2019, solves some of the problems you run into where you can’t use RelativePath or Query to construct the url. See https://blog.crossjoin.co.uk/2019/04/25/skip-test-connection-power-bi-refresh-failures/

UPDATE #2: since writing this post the error message you’ll get in the Power BI Service has changed to this: “You can’t schedule refresh for this dataset because the following data sources currently don’t support refresh”. The problem and the solution are exactly the same though.

Using The RelativePath And Query Options With Web.Contents() In Power Query And Power BI M Code

The Web.Contents() function in M is the key to getting data from web pages and web services, and has a number of useful – but badly documented – options that make it easier to construct urls for your web service calls.

Consider the following url:

https://data.gov.uk/api/3/action/package_search?q=cows

It is a call to the metadata api (documentation here) for https://data.gov.uk/, the UK government’s open data portal, and returns a JSON document listing all the datasets found for a search on the keyword “cows”. You can make this call using Web.Contents() quite easily like so:

Web.Contents(
 "https://data.gov.uk/api/3/action/package_search?q=cows"
)

However, instead of having one long string for your url (which will probably need to be constructed in a separate step) you can use the RelativePath and Query options with Web.Contents(). They are given in the second parameter of the function and passed through as fields in a record. RelativePath adds some extra text to the base url given in the first parameter for the function, while Query allows you to add query parameters to the url, and is itself a record.

So, taking the example above, if the base url for the api is https://data.gov.uk/api we can use these options like so:

Web.Contents(
 "https://data.gov.uk/api",
 [
  RelativePath="3/action/package_search",
  Query=[q="cows"]
 ]
)

RelativePath is just the string “3/action/package_search” and is added to the base url. There is just one query parameter “q”, the search query, and the search term is “cows”, so Query takes a record with one field: [q="cows"]. If you want to specify multiple query parameters you just need to add more fields to the Query record; for example:

Web.Contents(
	"https://data.gov.uk/api",
	[
		RelativePath="3/action/package_search",
		Query=
		[
			q="cows",
			rows="20"
		]
	]
)

Generates a call that returns 20 results, rather than the default 10:

https://data.gov.uk/api/3/action/package_search?q=cows&rows=20
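
To show why this is easier than building one long string, here is a sketch of a function that takes the search term and the number of rows as parameters. The function and parameter names are hypothetical, and the fallback of 10 rows simply mirrors the default mentioned above. Note that the values in the Query record need to be text, hence the call to Number.ToText().

[sourcecode language="text"]
let
    // Hypothetical helper that searches the data.gov.uk metadata api
    SearchPackages = (SearchTerm as text, optional RowCount as nullable number) =>
        let
            // Fall back to 10 rows if no row count is supplied
            Rows = if RowCount = null then 10 else RowCount,
            Response = Web.Contents(
                "https://data.gov.uk/api",
                [
                    RelativePath = "3/action/package_search",
                    // Query parameter values must be text
                    Query = [q = SearchTerm, rows = Number.ToText(Rows)]
                ]),
            Result = Json.Document(Response)
        in
            Result
in
    SearchPackages("cows", 20)
[/sourcecode]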

Obviously these options make it easier to construct urls and the code is much clearer, but there are also other benefits to using these options which I’ll cover in another blog post soon.

Note: at the time of writing there is a bug that causes the value given in RelativePath to be appended twice when the Web.Page() function is also used. Hopefully this will be fixed soon.

UPDATE January 2021: This blog post has more details on using the Query option: 
https://blog.crossjoin.co.uk/2021/01/10/handling-multiple-url-query-parameters-with-the-same-name-using-web-contents-in-power-query-power-bi/

Handling 404–Not Found Errors With Web.Contents() In Power Query And Power BI

One strange feature of the Web.Contents() function in Power Query and Power BI is that it doesn’t respond in a consistent way to the standard error handling techniques used in M. I don’t know if this is a bug or a feature, but it’s certainly something I’ve run into a few times so I thought I would share a description of the problem and a way of working around it.

First of all, what’s the problem? Imagine that you wanted to import a list of training courses from that fine UK Microsoft BI and SQL Server training company Technitrain into Power Query or Power BI. You could do so using an M query that uses the Web.Contents() function to get the course RSS feed, like so:

let
    Source = Web.Contents("http://technitrain.com/feed/")
in
    Source

But what happens if you get the URL wrong, or there’s some other problem with the site? For example, the following URL will give a 404 – Not Found error because the page doesn’t exist:

http://technitrain.com/blahblah

If you use it in an M query, like so:

let
    Source = Web.Contents("http://technitrain.com/blahblah")
in
    Source

Unsurprisingly you get an error:

DataSource.Error: Web.Contents failed to get contents from ‘http://technitrain.com/blahblah’ (404): Not Found

image

The real issue is, though, when you attempt to handle this error with a try/otherwise statement like so:

let
    Source = try
        Web.Contents("http://technitrain.com/blahblah")
    otherwise
        "Error!"
in
    Source

…it doesn’t work and you get the same error! What’s strange is that in some cases a try/otherwise block in more complex code will work, so for example in:

let
    Source = try
        Xml.Tables(
            Web.Contents("http://technitrain.com/blahblah")
        )
    otherwise
        "Error!"
in
    Source

… the error does get caught:

image

This thread on the Power Query forum suggests it’s something to do with lazy evaluation, but I haven’t been able to determine the situations when it does work and when it doesn’t.
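
For comparison, here is a minimal sketch of try/otherwise catching an error that has nothing to do with Web.Contents(). The failing expression is just an arbitrary example of something that raises an error in M.

[sourcecode language="text"]
let
    // Number.FromText raises an error because the text is not a number,
    // and the otherwise branch supplies the replacement value
    Source = try Number.FromText("not a number") otherwise "Error!"
in
    Source
[/sourcecode]

This returns the text "Error!", which is the behaviour you would expect from the Web.Contents() example too.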

Instead, it is possible to handle specific HTTP error codes using the ManualStatusHandling option in Web.Contents():

let
    Source = Web.Contents(
        "http://technitrain.com/blahblah",
        [ManualStatusHandling={404}])
in
    Source

The ManualStatusHandling option takes a list of numeric HTTP error codes, and if you run the above example you’ll see that the query no longer returns an error.

The next problem is, then, how do you know whether the request worked or not? It turns out that you can find out by looking at the metadata associated with the Source variable (for some background on getting metadata values see this post). So, for example, using Value.Metadata() on the Source variable like so:

let
    Source = Web.Contents(
        "http://technitrain.com/blahblah",
        [ManualStatusHandling={404}]),
    GetMetadata = Value.Metadata(Source)
in
    GetMetadata

Returns a record which, among other things, contains the HTTP response code:

image

Therefore you can use something like the following pattern to trap 404 errors:

let
    Source = Web.Contents(
        "http://technitrain.com/blahblah",
        [ManualStatusHandling={404}]),
    GetMetadata = Value.Metadata(Source),
    GetResponseStatus = GetMetadata[Response.Status],
    Output = if GetResponseStatus=404 then "Error!" else Source
in
    Output
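
The same pattern can be wrapped in a function if you need it in more than one place. The following is only a sketch: the GetWithStatus name, the shape of the record it returns and the list of status codes it handles are all my own assumptions, so adjust them to the codes you expect from your web service.

[sourcecode language="text"]
let
    // Hypothetical helper that returns a record describing the outcome of the request
    GetWithStatus = (Url as text) as record =>
        let
            // Stop these status codes from raising an error
            Response = Web.Contents(Url, [ManualStatusHandling={400, 404, 500}]),
            // Read the HTTP status code from the metadata on the response
            Status = Value.Metadata(Response)[Response.Status],
            Output =
                if Status = 200
                then [Succeeded = true, Status = Status, Content = Response]
                else [Succeeded = false, Status = Status, Content = null]
        in
            Output
in
    GetWithStatus("http://technitrain.com/blahblah")
[/sourcecode]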

Another Way To Check Query Folding In Power BI/Power Query M Code

Following on from my last post on checking whether query folding is taking place in a Power BI/Power Query query, I’ve just stumbled on another way of doing this – by looking at metadata. I blogged about M metadata and Power Query some time ago (you might want to read the post to get some background), but it seems like metadata is used more widely than I thought in M…

Take the following query, as used in my previous blog post:

[sourcecode language=”text”]
let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" =
        Table.SelectColumns(
            dbo_DimDate,
            {"DateKey", "FullDateAlternateKey",
            "DayNumberOfWeek"}),
    #"Filtered Rows" =
        Table.SelectRows(
            #"Removed Other Columns",
            each ([DayNumberOfWeek] = 1))
in
    #"Filtered Rows"
[/sourcecode]

As shown in that post this query connects to the Adventure Works DW database in SQL Server and gets a few columns plus some filtered rows from the DimDate table, and most importantly query folding takes place for all of these transformations.

It turns out that each of the variables used as steps in the query can have metadata automatically assigned to them, some of which is relevant to query folding, and we can use the Value.Metadata() function to get this metadata. So, taking the previous query, we can add two extra steps:

[sourcecode language=”text” highlight=”13,14″ padlinenumbers=”true”]
let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" =
        Table.SelectColumns(
            dbo_DimDate,
            {"DateKey", "FullDateAlternateKey",
            "DayNumberOfWeek"}),
    #"Filtered Rows" =
        Table.SelectRows(
            #"Removed Other Columns",
            each ([DayNumberOfWeek] = 1)),
    GetMetadata = Value.Metadata(#"Filtered Rows"),
    QueryFolding = GetMetadata[QueryFolding]
in
    QueryFolding
[/sourcecode]

The GetMetadata step gets the metadata associated with the #"Filtered Rows" step, and that metadata comes back in the form of a record with a single field called QueryFolding. The next step gets the data from that field, which is again in the form of a record:

image

So we can see quite clearly that this step is being folded and that SQL is being generated in the background. It doesn’t seem to be possible to get the SQL query generated here though.
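
If you want to run the same check against a step where you're not sure the metadata will be there, one option is M's optional field access operator (the ? after the field name), which returns null instead of raising an error when a field is missing. This is only a sketch, and I'm assuming, rather than having confirmed, that steps with no folding information simply have no QueryFolding field in their metadata:

[sourcecode language="text"]
let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    // Value.Metadata() returns a record, which may be empty
    GetMetadata = Value.Metadata(dbo_DimDate),
    // The trailing ? gives null, not an error, if the field does not exist
    QueryFolding = GetMetadata[QueryFolding]?
in
    QueryFolding
[/sourcecode]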

One last thing to note: while all of this works for SQL Server, when I tested it on an SSAS data source where query folding is also definitely taking place the metadata told me – incorrectly – that there was no query folding happening (I couldn’t see the MDX query generated using the View Native Query menu option either). Maybe this is all still a work in progress?

Value.ResourceExpression() And Query Folding In Power BI

When the June update for Power BI Desktop was released, I was extremely excited to discover an undocumented new feature that allows you to see when query folding is taking place while loading data in the Query Editor. Devin Knight beat me to blogging about it with an excellent post that I strongly recommend you read:

https://devinknightsql.com/2016/07/03/power-bi-checking-query-folding-with-view-native-query/

As Devin says, this is very important because it’s the first time that we as developers can check to see whether query folding is taking place within the Power BI Query Editor (I guess this feature will come to Power Query soon, maybe sometime in autumn 2016), and getting query folding working is the key to getting good performance when loading data.

Looking at this more closely, I’ve found that it’s also possible to get the query generated in code as well as just seeing it in a dialog box. For example, consider the following M code that returns a few columns and some filtered rows from the DimDate table in the Adventure Works DW SQL Server database:

[sourcecode language="text" padlinenumbers="true"]
let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" =
        Table.SelectColumns(
            dbo_DimDate,
            {"DateKey", "FullDateAlternateKey",
            "DayNumberOfWeek"}),
    #"Filtered Rows" =
        Table.SelectRows(
            #"Removed Other Columns",
            each ([DayNumberOfWeek] = 1))
in
    #"Filtered Rows"
[/sourcecode]

image

You can confirm that query folding is taking place and see the SQL that is being generated for the Filtered Rows step by right-clicking on it in the Applied Steps pane and choosing View Native Query:

image

image

If you now add two extra steps to the query you can get the SQL query for the #"Filtered Rows" step as a text value using the Value.ResourceExpression() function, like so:

[sourcecode language="text" highlight="13,14,15,16"]
let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" =
        Table.SelectColumns(
            dbo_DimDate,
            {"DateKey", "FullDateAlternateKey",
            "DayNumberOfWeek"}),
    #"Filtered Rows" =
        Table.SelectRows(
            #"Removed Other Columns",
            each ([DayNumberOfWeek] = 1)),
    FindQueryRecord =
        Value.ResourceExpression(#"Filtered Rows"),
    Output =
        FindQueryRecord[Arguments]{1}[Value]
in
    Output
[/sourcecode]

image

As far as I can see, the Value.ResourceExpression() function can be used with any expression where query folding is taking place and returns a record which, if you dig deep enough into it, contains the SQL query. The record itself seems to contain all the information needed to invoke the new Value.NativeQuery() function, which I’ve also been looking at recently but won’t be blogging about just yet because I’m told it’s not quite finished.
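
Since this function is undocumented, and the position of the SQL inside the record is something I found by digging rather than something I can guarantee, it may be safer to wrap the lookup in a try ... otherwise expression. The following is a sketch under the assumption that the lookup fails with an error when there is no folded query to return; in that case the query returns null:

[sourcecode language="text"]
let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Filtered Rows" =
        Table.SelectRows(
            dbo_DimDate,
            each ([DayNumberOfWeek] = 1)),
    FindQueryRecord =
        Value.ResourceExpression(#"Filtered Rows"),
    // M is lazy, so FindQueryRecord is only evaluated inside the try;
    // any error from it, or from the lookup, is replaced by null
    Output =
        try FindQueryRecord[Arguments]{1}[Value]
        otherwise null
in
    Output
[/sourcecode]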

Why is it useful to get the SQL query as text, rather than just seeing it? Well… I have some ideas, and hopefully some other upcoming blog posts will make things clearer!

Step Descriptions And Comments In M Code In Power BI Desktop

The June release of Power BI Desktop has what seems to be a fairly unremarkable new feature in that it allows you to add descriptions to each step in a query in the Query Editor window. However, the implementation turns out to be a lot more interesting than you might expect: the step descriptions become comments in the M code and, even better, if you write M code in the Advanced Editor window your comments appear as descriptions in the Applied Steps pane.

Take the following M query, entered in the Advanced Editor, as an example:

[sourcecode language='text'  padlinenumbers='true']
let
    Step1 = 5,
    Step2 = 10,
    Step3 = Step1 * Step2
in
    Step3
[/sourcecode]

There are three variables declared in the let expression which appear as three steps in the Applied Steps pane. The first two steps declare integers and the third multiplies these two integers together, returning 50.

If you right-click on the first step and select Properties, then you can enter a description for the step in the Properties pane that appears:

image

image

After you click OK, the description is visible as a tooltip when you mouse-over the step:

image

If you then open the Advanced Editor window you’ll see the M code for the query has now been changed to include a comment (NB comments in M code start with //):

[sourcecode language='text'  highlight='2']
let
    // Declare the first number
    Step1 = 5,
    Step2 = 10,
    Step3 = Step1 * Step2
in
    Step3
[/sourcecode]

image

If you then edit the M code in the Advanced Editor window to add a comment in the line before a step, like so:

[sourcecode language='text'  highlight='5']
let
    // Declare the first number
    Step1 = 5,
    Step2 = 10,
    // Multiply the two numbers together
    Step3 = Step1 * Step2
in
    Step3
[/sourcecode]

…then this will also show up as a description when you mouse-over the step in the Applied Steps pane:

image

As a result, for anyone like me who writes a lot of M code manually in the Advanced Editor window, this turns out to be a really handy feature.