Web.Contents(), M Functions And Dataset Refresh Errors In Power BI

One slightly frustrating feature of Power BI is that some of the cool stuff you can do in M code for loading data, and which works in Power BI Desktop (and in Power Query), causes errors when you try to refresh your dataset after it has been published to PowerBI.com. I recently learned some interesting tricks for working around these problems when you are using Web.Contents() and M custom functions, which I thought deserved a wider audience and which are the subject of this post; thanks are due to Curt Hagenlocher of Microsoft and Idan Cohen for sharing this information.

First of all, I recommend you read my previous post on using the RelativePath and Query options with Web.Contents() to get some background, not just on the M functionality I’ll be using but also on the web service I’ll be using in my examples.

Let’s look at an example of where the problem occurs. The following M query uses a function to call the UK government’s open data metadata search API multiple times and then return a result into a table:

let
    Terms = 
      #table(
       {"Term"},
       {{"apples"}, {"oranges"}, {"pears"}}
       ),
    SearchSuccessful = (Term) => 
    let
        Source = 
         Json.Document(
           Web.Contents(
             "https://data.gov.uk/api/3/action/package_search?q=" 
             & Term
            )
           ),
        Success = Source[success]
    in
        Success,
    Output = 
     Table.AddColumn(
       Terms, 
       "Search Successful", 
       each SearchSuccessful([Term])
      )
in
    Output

Here’s the output:

image

This is just a variation on the widely-used M pattern for using functions to iterate over and combine data from multiple data sources; Matt Masson has a good blog describing this pattern here. In this case I’m doing the following:

  • Defining a table using #table() with three rows containing three search terms.
  • Defining a function that calls the metadata API. It takes one parameter, a search term, and returns a value indicating whether the search was successful or not from the JSON document returned. What the API actually returns isn’t relevant here, though, just the fact that I’m calling it. Note the highlighted lines in the code above that show how I’m constructing the URL passed to Web.Contents() by simply concatenating the base URL with the string passed in via the custom function’s Term parameter.
  • Adding a custom column to the table returned by the first step, and calling the function defined in the second step using the search term given in each row.

This query refreshes with no problems in Power BI Desktop. However, when you publish a report that uses this code to PowerBI.com and try to refresh the dataset, you’ll see that refresh fails and returns a rather unhelpful error message:

Data source error Unable to refresh the model (id=1264553) because it references an unsupported data source.

image

image

The problem is that when a published dataset is refreshed, Power BI does some static analysis on the code to determine what the data sources for the dataset are and whether the supplied credentials are correct. Unfortunately in some cases, such as when the definition of a data source depends on the parameters from a custom M function, that static analysis fails and therefore the dataset does not refresh.

The good news is that when, as in this case, the data source is a call to Web.Contents() then Power BI only checks the base url passed into the first parameter during static analysis – and as my previous blog post shows, by using the RelativePath and Query options with Web.Contents() you can leave the value passed to the first parameter as a static string. Therefore, the following version of the query does refresh successfully in Power BI:

let
    Terms = 
      #table(
       {"Term"},
       {{"apples"}, {"oranges"}, {"pears"}}
       ),
    SearchSuccessful = (Term) => 
    let
        Source = 
         Json.Document(
           Web.Contents(
             "https://data.gov.uk/api/3/action/package_search", 
             [Query=[q=Term]]
            )
           ),
        Success = Source[success]
    in
        Success,
    Output = 
     Table.AddColumn(
       Terms, 
       "Search Successful", 
       each SearchSuccessful([Term])
      )
in
    Output

This technique will only work if the url passed to the first parameter of Web.Contents() is valid in itself, is accessible and does not return an error. But what if it isn’t? Luckily there’s another trick you can play: when you specify the Query option it can override parts of the url supplied in the first parameter. For example, take the following expression:

Web.Contents(
 "https://data.gov.uk/api/3/action/package_search?q=apples", 
 [Query=[q="oranges"]]
)

When static analysis is carried out before dataset refresh, the url

https://data.gov.uk/api/3/action/package_search?q=apples

..is evaluated. However when the dataset is actually refreshed, the search term in the Query option overrides the search term in the base url, so that the call to the web service that is actually made and whose data is used by the query is:

https://data.gov.uk/api/3/action/package_search?q=oranges

This means you can specify a base url that isn’t really just a base url just so that static analysis succeeds, and then use the Query option to construct the url you really want to use.

Of course this is all a bit of a hack and I’m sure, eventually, we’ll get to the point where any M code that works in Power BI Desktop and/or Power Query works in a published report. However it doesn’t sound as though this will be happening in the near future so it’s good to know how to work around this problem. I wonder whether there are other, similar tricks you can play with functions that access data sources apart from Web.Contents()? I need to do some testing…

Using The RelativePath And Query Options With Web.Contents() In Power Query And Power BI M Code

The Web.Contents() function in M is the key to getting data from web pages and web services, and has a number of useful – but badly documented – options that make it easier to construct urls for your web service calls.

Consider the following url:

https://data.gov.uk/api/3/action/package_search?q=cows

It is a call to the metadata api (documentation here) for https://data.gov.uk/, the UK government’s open data portal, and returns a JSON document listing all the datasets found for a search on the keyword “cows”. You can make this call using Web.Contents() quite easily like so:

Web.Contents(
 "https://data.gov.uk/api/3/action/package_search?q=cows"
)

However, instead of having one long string for your url (which will probably need to be constructed in a separate step) you can use the RelativePath and Query options with Web.Contents(). They are given in the second parameter of the function and passed through as fields in a record. RelativePath adds some extra text to the base url given in the first parameter for the function, while Query allows you to add query parameters to the url, and is itself a record.

So, taking the example above, if the base url for the api is https://data.gov.uk/api we can use these options like so:

Web.Contents(
 "https://data.gov.uk/api", 
 [
  RelativePath="3/action/package_search", 
  Query=[q="cows"]
 ]
)

RelativePath is just the string “3/action/package_search” and is added to the base url. There is just one query parameter “q”, the search query, and the search term is “cows”, so Query takes a record with one field: [q=”cows”]. If you want to specify multiple query parameters you just need to add more fields to the Query record; for example:

Web.Contents(
	"https://data.gov.uk/api", 
	[
		RelativePath="3/action/package_search", 
		Query=
		[
			q="cows", 
			rows="20"
		]
	]
)

Generates a call that returns 20 results, rather than the default 10:

https://data.gov.uk/api/3/action/package_search?q=cows&rows=20

Obviously these options make it easier to construct urls and the code is much clearer, but there are also other benefits to using these options which I’ll cover in another blog post soon.

Note: at the time of writing there is a bug that causes the value given in RelativePath to be appended twice when the Web.Page() function is also used. Hopefully this will be fixed soon.

Handling 404–Not Found Errors With Web.Contents() In Power Query And Power BI

One strange feature of the Web.Contents() function in Power Query and Power BI is that it doesn’t respond in a consistent way to the standard error handling techniques used in M. I don’t know if this is a bug or a feature, but it’s certainly something I’ve run into a few times so I thought I would share a description of the problem and a way of working around it.

First of all, what’s the problem? Imagine that you wanted to import a list of training courses from that fine UK Microsoft BI and SQL Server training company Technitrain into Power Query or Power BI. You could do so using an M query that uses the Web.Contents() function to get the course RSS feed, like so:

let
    Source = Web.Contents("http://technitrain.com/feed/")
in
    Source

 

But what happens if you get the URL wrong, or there’s some other problem with the site? For example, the following URL will give a 404 – Not Found error because the page doesn’t exist:

http://technitrain.com/blahblah

If you use it in an M query, like so:

let
    Source = Web.Contents("http://technitrain.com/blahblah")
in
    Source

Unsurprisingly you get an error:

DataSource.Error: Web.Contents failed to get contents from ‘http://technitrain.com/blahblah’ (404): Not Found

image

The real issue is, though, when you attempt to handle this error with a try/otherwise statement like so:

let
    Source = try 
                Web.Contents("http://technitrain.com/blahblah") 
             otherwise 
                "Error!"
in
    Source

…it doesn’t work and you get the same error! What’s strange is that in some cases a try/otherwise block in more complex code will work, so for example in:

let
    Source = try 
                Xml.Tables(
                 Web.Contents("http://technitrain.com/blahblah")
                )
             otherwise 
                "Error!"
in
    Source

… the error does get caught:

image

This thread on the Power Query forum suggests it’s something to do with lazy evaluation, but I haven’t been able to determine the situations when it does work and when it doesn’t.

Instead, it is possible to handle specific HTTP error codes using the ManualStatusHandling option in Web.Contents():

let
    Source = Web.Contents(
    	"http://technitrain.com/blahblah",
    	[ManualStatusHandling={404}])
in
    Source

The ManualStatusHandling option takes a list of numeric HTTP error codes, and if you run the above example you’ll see that the query no longer returns an error.

The next problem is, then, how do you know whether the request worked or not? It turns out that you can find out by looking at the metadata associated with the Source variable (for some background on getting metadata values see this post). So, for example, using Value.Metadata() on the Source variable like so:

let
    Source = Web.Contents(
    	"http://technitrain.com/blahblah",
    	[ManualStatusHandling={404}]),
    GetMetadata = Value.Metadata(Source)
in
    GetMetadata

Returns a record which, among other things, contains the HTTP response code:

image

Therefore you can use something like the following pattern to trap 404 errors:

let
    Source = Web.Contents(
        "http://technitrain.com/blahblah",
        [ManualStatusHandling={404}]),
    GetMetadata = Value.Metadata(Source),
    GetResponseStatus = GetMetadata[Response.Status],
    Output = if GetResponseStatus=404 then "Error!" else Source
in
    Output

Another Way To Check Query Folding In Power BI/Power Query M Code

Following on from my last post on checking whether query folding is taking place in a Power BI/Power Query query, I’ve just stumbled on another way of doing this – by looking at metadata. I blogged about M metadata and Power Query some time ago (you might want to read the post to get some background), but it seems like metadata is used more widely than I thought in M…

Take the following query, as used in my previous blog post:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = 
    Table.SelectColumns(
        dbo_DimDate,
        {"DateKey", "FullDateAlternateKey", 
        "DayNumberOfWeek"}),
    #"Filtered Rows" = 
    Table.SelectRows(
        #"Removed Other Columns", 
        each ([DayNumberOfWeek] = 1))
in
    #"Filtered Rows"

As shown in that post this query connects to the Adventure Works DW database in SQL Server and gets a few columns plus some filtered rows from the DimDate table, and most importantly query folding takes place for all of these transformations.

It turns out that each of the variables used as steps in the query can have metadata automatically assigned to them, some of which is relevant to query folding, and we can use the Value.Metadata() function to get this metadata. So, taking the previous query, we can add two extra steps:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = 
    Table.SelectColumns(
        dbo_DimDate,
        {"DateKey", "FullDateAlternateKey", 
        "DayNumberOfWeek"}),
    #"Filtered Rows" = 
    Table.SelectRows(
        #"Removed Other Columns", 
        each ([DayNumberOfWeek] = 1)),
    GetMetadata = Value.Metadata(#"Filtered Rows"),
    QueryFolding = GetMetadata[QueryFolding]
in
    QueryFolding

The GetMetadata step gets the metadata associated with the #”Filtered Rows” step, and that metadata comes back in the form of a record with a single field called QueryFolding. The next step gets the data from that field, which is again in the form of a record:

image

So we can see quite clearly that this step is being folded and that SQL is being generated in the background. It doesn’t seem to be possible to get the SQL query generated here though.

One last thing to note: while all of this works for SQL Server, when I tested it on an SSAS data source where query folding is also definitely taking place the metadata told me – incorrectly – that there was no query folding happening (I couldn’t see the MDX query generated using the View Native Query menu option either). Maybe this is all still a work in progress?

Value.ResourceExpression() And Query Folding In Power BI

When the June update for Power BI Desktop was released, I was extremely excited to discover an undocumented new feature that allows you to see when query folding is taking place while loading data in the Query Editor. Devin Knight beat me to blogging about it with an excellent post that I strongly recommend you to read:

https://devinknightsql.com/2016/07/03/power-bi-checking-query-folding-with-view-native-query/

As Devin says, this is very important because it’s the first time that we as developers can check to see whether query folding is taking place within the Power BI Query Editor (I guess this feature will come to Power Query soon, maybe sometime in autumn 2016), and getting query folding working is the key to getting good performance when loading data.

Looking at this more closely, I’ve found that it’s also possible to get the query generated in code as well as just seeing it in a dialog box. For example, consider the following M code that returns a few columns and some filtered rows from the DimDate table in the Adventure Works DW SQL Server database:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = 
	Table.SelectColumns(
		dbo_DimDate,
		{"DateKey", "FullDateAlternateKey", 
		"DayNumberOfWeek"}),
    #"Filtered Rows" = 
	Table.SelectRows(
		#"Removed Other Columns", 
		each ([DayNumberOfWeek] = 1))
in
    #"Filtered Rows"

image

You can confirm that query folding is taking place and see the SQL that is being generated for the Filtered Rows step by right-clicking on it in the Applied Steps pane and choosing View Native Query:

image

image

If you now add two lines of code to the query you can get the SQL query for the #”Filtered Rows” step as a text value using the Value.ResourceExpression() function, like so:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = 
	Table.SelectColumns(
		dbo_DimDate,
		{"DateKey", "FullDateAlternateKey", 
		"DayNumberOfWeek"}),
    #"Filtered Rows" = 
	Table.SelectRows(
		#"Removed Other Columns", 
		each ([DayNumberOfWeek] = 1)),
    FindQueryRecord = 
        Value.ResourceExpression(#"Filtered Rows"),
    Output = 
        FindQueryRecord[Arguments]{1}[Value]
in
    Output

image

As far as I can see, the Value.ResourceExpression() function can be used with any expression where query folding is taking place and returns a record which, if you dig deep enough into it contains the SQL query. The record itself seems to contain all the information needed to invoke the new Value.NativeQuery() function which I’ve also been looking at recently – but I won’t be blogging about just yet because I’m told it’s not quite finished.

Why is it useful to get the SQL query as text, rather than just seeing it? Well… I have some ideas, and hopefully some other upcoming blog posts will make things clearer!

The Ellipsis/Not Implemented Error In M

Recently I was creating a parameter in Power BI Desktop and had it configured something like this:

image

I didn’t bother to choose anything in the Default Value dropdown box, and when I looked at the code for the parameter in the Advanced Editor I saw this:

"a" 
meta 
[IsParameterQuery=true, 
 List={"a", "b", "c"}, 
 DefaultValue=..., 
 Type="Text", 
 IsParameterQueryRequired=true]

I was interested to know what the ellipsis symbol (three dots …) in the DefaultValue field in the record meant, and looking in the Language Reference I found that in M it can be used for two purposes: as an open record marker (maybe something for a future blog post) and, as in this case, as a quick way of returning an error. The language reference says that it is directly equivalent to the following expression, which returns a “Not Implemented” error:

error Error.Record("Expression.Error", "Not Implemented")

But from what I can see, it actually returns a “Value was not specified” error instead. Here’s another example of how it can be used in a function definition, and what it returns:

let
    MyFunction = (x,y) => if x>y then true else ...,
    Output = MyFunction(0,1)
in
    Output

image

It’s not something I think I’ll be using in my own code, but it’s good to know what it means!

Creating Sequences Of Integers And Characters In Power BI/Power Query Lists

When you’re writing your own M code you often find yourself needing to create a list containing a sequence of numbers or characters. Luckily the M language allows you to do this very easily when you are defining lists by using expressions in the format

{lowest_integer..highest_integer}

For example, imagine you want to create a list with all of the integers between 1 and 5. Instead of writing

{1, 2, 3, 4, 5}

You can write the following:

{1..5}

and it will return the same list:

image

You can also use this format in more complex list definitions, for example

{1..3, 5, 7..9}

Returns the list

{1, 2, 3, 5, 7, 8, 9}

image

When you’re using this technique you must always put the lowest integer first and the highest integer last; if you don’t do this you get an empty list. So, for example, the expression

{5..1}

Returns an empty list:

image

It’s also possible to use this technique to create lists of characters. For example, the expression:

{"a".."z"}

Returns a list containing all of the lowercase letters of the alphabet:

image

The first character in the expression has to have the lowest Unicode value and the second character has to have the highest Unicode value, and the sequence of characters returned is the list of all characters with Unicode values in that range. As a result, the expression

{"#".."%"}

Returns the list

{"#", "$", "%"}

image

And the expression

{"a".."Z"}

Returns an empty list because the Unicode value of “a” is greater than the Unicode value of “Z”.

This technique doesn’t work for decimal numbers, dates or other data types. If you want a sequence of values of these types you need to use functions list List.Dates() and List.Numbers().

Lists are, of course, used all over the place in M. Building on my recent post on using #table() to create tables with no data source, here’s one last example of using lists containing sequences to create a simple table with three columns and three rows:

#table({"A".."C"}, {{1..3}, {7..9}, {11..13}})

image