Defining Variables In DAX Queries

Variables are the best thing to happen to DAX since, well forever – they are so cool I’m almost ready to like DAX as much as I like MDX. There are already several good articles and blog posts out there describing how to use them (see here and here), but I was looking at a Profiler trace the other day and saw something I hadn’t yet realised about them: you can declare and use variables in the DEFINE clause of a DAX query. Since my series of posts on DAX queries still gets a fair amount of traffic, I thought it would be worth writing a brief post showing how this works.

Say you have the following table (called Sales) in your model:

image

You can declare DAX variables in the DEFINE clause of a query like so:

[sourcecode language=”text” padlinenumbers=”true”]
DEFINE
VAR MyMonth = "January"
VAR FilteredMonths =
FILTER ( VALUES ( Sales[Month] ), Sales[Month] = MyMonth )
EVALUATE
CALCULATETABLE ( Sales, FilteredMonths )
[/sourcecode]

This query returns the following result:

image

The benefits of using variables in this way are the same as you get when using variables in measures and calculated columns: improved readability, less repetition and potential performance improvements.

I also wondered whether I would be able to refer to these variables inside measures declared in the DEFINE clause, but unfortunately you can’t. The following query:

[sourcecode language=”text” padlinenumbers=”true”]
DEFINE
VAR MyMonth = "January"
VAR FilteredMonths =
FILTER ( VALUES ( Sales[Month] ), Sales[Month] = MyMonth )
MEASURE Sales[FirstMeasure] =
COUNTROWS ( FilteredMonths )
EVALUATE
ROW ( "First Measure", [FirstMeasure] )
[/sourcecode]

…returns the error

“Failed to resolve name ‘FilteredMonths’. It is not a valid table, variable or function name”.

However if you define your calculations inside the query itself, for example using the Summarize() or AddColumns() functions, or like so:

[sourcecode language=”text”]
DEFINE
VAR MyMonth = "January"
VAR FilteredMonths =
FILTER ( VALUES ( Sales[Month] ), Sales[Month] = MyMonth )
EVALUATE
ROW (
"First Calc", COUNTROWS (FilteredMonths),
"Second Calc", CALCULATE (SUM(Sales[Sales]), FilteredMonths)
)
[/sourcecode]

…the query works:

image

In a lot of cases, multiple calculations contain some of the same logic and being able to use variables to share tables and values between calculations opens up some really interesting opportunities for performance optimisations.

Generating JSON In Power BI And Power Query

Often, when calling web services from Power BI or Power Query, you’ll need to generate some JSON inside your query to send to these web services. The M language makes this relatively easy to do with the Json.FromValue() function but there’s only one example of how to use it in the help so I though it might be useful to provide some worked examples of M data types and how Json.FromValue() turns them into JSON.

First, here’s a function – that I’ve called GetJson() for the examples here – that takes a parameter of any data type, passes it to Json.FromValue() and returns the JSON representation of the input as text:

(InputData) =>

let
JsonOutput = Json.FromValue(InputData),
OutputText = Text.FromBinary(JsonOutput)
in
OutputText

Now, let’s see what this returns for a variety of different data types.

Tables

Take the following table on an Excel worksheet, called “Sales”:

image

The following query loads the data from this table and calls the GetJson() function defined above:

let

Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(
Source,
{{"Month", type text}, {"Sales", Int64.Type}}),
Output = GetJson(#"Changed Type")
in
Output

It returns the following JSON, an array of objects:

[

{"Month":"January","Sales":1},
{"Month":"February","Sales":2},
{"Month":"March","Sales":3}
]

Lists

M language lists are represented as JSON arrays, so the following query:

GetJson({"Apples", "Oranges", "Pears"})

…returns

["Apples","Oranges","Pears"]

And

GetJson({{1,"Apples"}, {2,"Oranges"}, {3,"Pears"}})

…returns

[[1,"Apples"],[2,"Oranges"],[3,"Pears"]]

Records

M language records are represented as JSON objects, so:

GetJson([month="January", product="Apples", sales=1])

…returns

{"month":"January","product":"Apples","sales":1}

Nested Data

Finally, in M it’s possible to have nested data structures: lists of tables, records containing lists and so on. These can be converted to JSON too, so in the following example of a record containing a text value, a list and a table (created using #table):

GetJson(

[
product="Apples",
colours={"Red","Green"},
sales=
#table({"Month", "Sales"},
{
{"January", 1},
{"February", 2}
}
)
]
)

…the JSON output is:

{

"product":"Apples",
"colours":[
"Red",
"Green"
],
"sales":[
{
"Month":"January",
"Sales":1
},
{
"Month":"February",
"Sales":2
}
]
}

You can download the sample Excel 2016 workbook with these examples in here.

Using The Invoke Custom Function Button In Power BI

There are a lot of cool new features in the September 2016 update for Power BI, so many in fact that several of the really important changes in the Query Editor are in danger of going unnoticed. In this post I want to walk through how to use the new Invoke Custom Function button in the Query Editor in Power BI and explain why it’s such a useful thing to have.

More advanced Power BI users will know that a lot of data loading patterns involve using custom M functions. In the past implementing these patterns involved learning M both for writing functions and also for invoking those functions. A few months ago Power BI introduced the ability to automatically generate functions from queries that use parameters, without needing to write code, and now with the latest update we can also invoke functions easily by clicking a button. This means that a lot more advanced data loading patterns are now available to users who don’t know any M and there’s even less need for someone like me to open the Advanced Editor window and start writing code.

Let’s take a look at how this works with a very simple example. Say you have a table that contains sales data, with the number of units sold and the price paid:

image

You now want to add a new column to this table that calculates the sales value as Units * Price, and you have a function to do this. Here’s what the M code for that function (called “Calculate Value”) could look like:

[sourcecode language=”text” padlinenumbers=”true”]
(Units as number, Price as number) => Units * Price
[/sourcecode]

image

With a query that returns your sales data and another query that returns the Calculate Value function, you can easily create a new column on the sales data query and invoke the function for each row. Go to the Sales query, go to the Add Column tab on the ribbon, and click Invoke Custom Function:

image

You’ll see the Invoke Custom Function dialog appear. Here you can choose the query that returns the function you want to use and enter the values you want to pass to that functions’ parameters. At the moment you can type in a value or pass values from a column in the table you’re invoking (strangely enough you don’t seem to be able to use Power BI parameters here yet though?):

image

Click OK and the function is invoked for every row in the table:

image

To take a more realistic example, look at this post – this new functionality replaces the step where I create a new Custom Column and invoke the GetSheet1() function that I created.

This is why Power BI is so successful: Microsoft are not only implementing high-profile wow features but also adding the obscure, unsexy features that nonetheless make a real difference to the productivity of advanced users.

SSAS Multidimensional Caching-Related Performance Problems With Power BI Tables

Last week I was doing some performance tuning for a customer using Power BI on top of a large SSAS Multidimensional cube via a Live connection. Some of their reports were performing particularly badly and I uncovered a problem with the DAX generated by Power BI for tables with a large number of rows, fields from two or more large hierarchies and totals turned on.

The problem is very easy to reproduce; I’m going to do it using a simplified version of the Adventure Works cube that contains only the Date and Product dimensions. Take a look at the following table from a Power BI report:

image

It has the Date attribute from the Date dimension, and the Product attribute from the Product dimension, on rows and two measures on columns. The table has a large number of rows in it (both Date and Product are fairly large hierarchies) and if you look at the DAX generated by Power BI you can see that it only requests the first 501 rows. That’s fine – the real problem is that Power BI also generates a second DAX query to get the two values displayed in the Total line at the bottom of the table. The DAX looks something like this, and is easily identifiable because it uses the Row() function:

[sourcecode language="text" padlinenumbers="true"]
EVALUATE
  CALCULATETABLE(
    ROW(
      "Sales_Amount", 'Internet Sales Facts'[Sales Amount],
      "Tax_Amt", 'Internet Sales Facts'[Tax Amt]
    ),
    KEEPFILTERS(
      GENERATE(
        KEEPFILTERS(VALUES('Product'[Product.Key0])),
        CALCULATETABLE(
          FILTER(
            KEEPFILTERS(VALUES('Order Date'[Date.Key0])),
            OR(
              NOT(ISBLANK('Internet Sales Facts'[Sales Amount])),
              NOT(ISBLANK('Internet Sales Facts'[Tax Amt]))
            )
          )
        )
      )
    )
  )
[/sourcecode]

This query has something in it – I don’t know what – that means that it cannot make use of the Analysis Services Storage Engine cache. Every time you run it SSAS will go to disk, read the data that it needs and then aggregate it, which means you’ll get cold-cache performance all the time. On a big cube this can be a big problem. This is very similar to problems I’ve seen with MDX queries on Multidimensional and which I blogged about here; it’s the first time I’ve seen this happen with a DAX query though. I suspect a lot of people using Power BI on SSAS Multidimensional will have this problem without realising it.

This problem does not occur for all tables – as far as I can see it only happens with tables that have a large number of rows and two or more hierarchies in. The easy way to check whether you have this problem is to refresh your report, run a Profiler trace that includes the Progress Report Begin/End and Query Subcube Verbose events (and any others you find useful) and then refresh the report again by pressing the Refresh button in Power BI Desktop without changing it at all. In your trace, if you see any of the Progress Report events appear when that second refresh happens, as well as Query Subcube Verbose events with an Event Subclass of Non-cache data, then you know that the Storage Engine cache is not being used.

image

Also look at the Duration column in the trace for these events which shows the time in milliseconds that they took to execute. This will tell you how much of an impact this problem is having on your report refresh times.

The easy way to stop this happening is to turn off the totals row in the table:

image

Displaying the totals in a separate table also seems to avoid the problem, although of course it doesn’t look as good. Only putting one hierarchy in the table apart from your measures, also seems to solve the problem. You could also try all the usual methods to improve SSAS performance such as building aggregations on the cube.

The dev team is aware of this problem but it’s unlikely to be fixed in the short term.

UPDATE 8th September 2020: This problem has finally been fixed as part of the general SuperDAX performance improvements in Analysis Services 2019 CU5. See https://powerbi.microsoft.com/en-us/blog/improving-power-bi-performance-when-querying-multidimensional-models/ for more details.

Loading Data From Multiple Excel Workbooks Into Power BI–And Making Sure Data Refresh Works After Publishing

I can hear you yawning already – yet another blog post on getting data from multiple Excel workbooks in Power Query and Power BI. Just about everyone who has ever written a blog post on Power BI has written about this subject, including me. However there’s a twist this time: what if your Excel workbooks are stored in SharePoint or OneDrive For Business? If they are, then your dataset may not refresh successfully after you have published unless you load your data in a particular way.

Some background first. A few weeks ago I was contacted by a reader who had seen my post on data refresh errors and the Web.Contents() function and was experiencing the same issue when using Web.Contents() to get data from multiple Excel workbooks stored in SharePoint. Up until recently the Web.Contents() function – which is used by the From Web option in Power BI’s Get Data experience – was the only way to get data reliably from single Excel files stored in SharePoint or OneDrive For Business. However the limitations on Web.Contents(), M functions and data refresh described in my blog post meant that it wasn’t possible to use Web.Contents() to get data from multiple Excel files stored in SharePoint or OneDrive For Business.

The April 2016 Power BI Desktop update introduced a new way of getting data from Excel files stored in SharePoint: the SharePoint Files data source, based on the SharePoint.Files() M function. Both Mike Carlo and Ken Puls have already blogged about this in depth and so I won’t repeat what they’ve written; their posts have a lot of good information on how to construct the URLs to find your files in SharePoint. How do you use it to combine data from multiple Excel workbooks though?

Imagine you have four identically-structure Excel workbooks stored in a folder in OneDrive For Business:

image

Each one has a table called Table1 with some sales data in it:

image

In Power BI Desktop, create a new query and click the Get Data button. In the Get Data dialog, go to Files and click on SharePoint Folder:

image

Enter the URL for your OneDrive For Business site:

image

You’ll then see all the files in all your folders in OneDrive For Business:

image

Filter the folders in the Folder Path column so you only see the files in the folder containing your Excel workbooks:

image

Right-click on the Content column and select Remove Other Columns to get rid of all but the Content column. Then click the Add Custom Column button and add the following expression:

[sourcecode language='text' ]
Excel.Workbook([Content])
[/sourcecode]

image

This tells Power BI to treat each file in this folder as an Excel Workbook. Click OK, and then right-click on the Content column and select Remove (you won’t need this any more). Next, click on the Expand icon in the top right-hand corner of the Custom column and click OK on the flyout:

image

image

This will give you a table listing all of the contents of each workbook:

image

Filter this table so you only have the four tables from each workbook:

Next, right-click on the Data column and select Remove Other Columns, then finally click OK on the Expand icon again:

image

You’ll now have a table containing all of the data from the tables called Table1 in each workbook in the folder:

image

Don’t forget to set appropriate data types for each column (columns will have the data type Any by default, which will be treated as text later on)! You can now build your report and publish it:

image

In order for refresh to work, you’ll need to configure the credentials used by the Power BI service to connect to your data. In the browser, click on the ellipses for the Dataset for your report and select Schedule Refresh:

image

This will take you to the Datasets tab on the Settings page. You can schedule refresh here if you want, but the important thing is to click the Edit Credentials link:

image

If like me you have an Office 365 subscription and use SharePoint Online, then the dialog choose oAuth2 in the Authentication Method dropdown box and click Sign In:

image

You’ll see the Windows Organizational Account sign-in page appear briefly while you are signed in but you shouldn’t need to do anything. And that’s it!

You can now edit the data in any of your Excel workbooks and, once the dataset has refreshed, those changes will show up in the report. If you want to do a manual refresh of the data, clicking the Refresh button at the top of the report won’t do any good; you have to click on the Refresh Now option on the dataset (seen in the screenshot above, just below the Schedule Refresh option).

Data-Driven Power BI Desktop Parameters Using List Queries

The July 2016 update for Power BI Desktop included the ability to make parameters data-driven, by giving the option to bind the Suggested Values (previously called the Allowed Values) property of a parameter to the output of a query that returns a list. However that’s pretty much all the information blog post gives you, so if you’re wondering what a list is and how to get query to return one so you can use it in a parameter then read on…

A list is one of the most useful data types in M, the language behind all of Power BI Desktop’s data-loading functionality. A list is nothing more than an ordered list of values of any data type and it’s written in M as a comma-delimited list enclosed in braces. Any query can return a list – most queries you build in the Query Editor window in Power BI Desktop will return tables but a query can in fact return a value of any data type.

Here’s an example of how to define a list containing three values in M:

[sourcecode language=”text” padlinenumbers=”true”]
{"Apples","Oranges","Pears"}
[/sourcecode]

You can create a query that returns this list in Power BI Desktop by clicking on the Get Data button and selecting the New Blank Query option:

image

…and then, in the Query Editor window, opening the Advanced Editor, deleting all the code in there and replacing it with an expression like the one above that returns a list:

image

When you click Done the Advanced Editor dialog will close and you’ll see the values in the list displayed in the Query Editor, along with the List Tools tab on the ribbon above it:

image

Although this looks table-like, it isn’t a table – don’t get confused between tables and lists!

Now you have a query that returns a list you can create a new parameter that uses it. In the Parameters dialog, create a new parameter then choose Query in the Suggested Values dropdown box and then choose the name of the query that returns the list (in this case I’ve called the query Fruit) in the Query dropdown box:

image

You now have a parameter whose suggested values are provided by the output of a query:

image

OK, so now you know what a query that returns a list looks like and how to use it in a parameter. The example above isn’t very practical though – how do you get a list of values from a real-world query? It’s actually very easy.

Imagine you have a query returning the contents of the DimDate table from the Adventure Works DW SQL Server sample database and you want to create a parameter that allows your user to select a day of the week. The EnglishDayNameOfWeek column in the DimDate table contains the day names that you need.

To get a list from a column in a table you need to click on the column to select it and then right-click and select either:

  • Drill Down
  • Add as New Query

image

Drill Down creates a new step in the current query that returns a list of all of the values in the selected table; Add as New Query gives you the same list but as a new query.

image

The last thing to do is to click the Remove Duplicates button so that the list only contains the seven distinct values:

image

Now you have a list whose values are derived from an external data source, ready to create a data-driven parameter.

One final thing to note: you may be wondering if it’s possible to create cascading parameters (like in Reporting Services) where the selection made in one parameter controls the available values for another parameter. As far as I can see this isn’t supported yet, unfortunately.

Web.Contents(), M Functions And Dataset Refresh Errors In Power BI

One slightly frustrating feature of Power BI is that some of the cool stuff you can do in M code for loading data, and which works in Power BI Desktop (and in Power Query), causes errors when you try to refresh your dataset after it has been published to PowerBI.com. I recently learned some interesting tricks for working around these problems when you are using Web.Contents() and M custom functions, which I thought deserved a wider audience and which are the subject of this post; thanks are due to Curt Hagenlocher of Microsoft and Idan Cohen for sharing this information.

First of all, I recommend you read my previous post on using the RelativePath and Query options with Web.Contents() to get some background, not just on the M functionality I’ll be using but also on the web service I’ll be using in my examples.

Let’s look at an example of where the problem occurs. The following M query uses a function to call the UK government’s open data metadata search API multiple times and then return a result into a table:

let
    Terms =
      #table(
       {"Term"},
       {{"apples"}, {"oranges"}, {"pears"}}
       ),
    SearchSuccessful = (Term) =>
    let
        Source =
         Json.Document(
           Web.Contents(
             "https://data.gov.uk/api/3/action/package_search?q="
             & Term
            )
           ),
        Success = Source[success]
    in
        Success,
    Output =
     Table.AddColumn(
       Terms,
       "Search Successful",
       each SearchSuccessful([Term])
      )
in
    Output

Here’s the output:

image

This is just a variation on the widely-used M pattern for using functions to iterate over and combine data from multiple data sources; Matt Masson has a good blog describing this pattern here. In this case I’m doing the following:

  • Defining a table using #table() with three rows containing three search terms.
  • Defining a function that calls the metadata API. It takes one parameter, a search term, and returns a value indicating whether the search was successful or not from the JSON document returned. What the API actually returns isn’t relevant here, though, just the fact that I’m calling it. Note the highlighted lines in the code above that show how I’m constructing the URL passed to Web.Contents() by simply concatenating the base URL with the string passed in via the custom function’s Term parameter.
  • Adding a custom column to the table returned by the first step, and calling the function defined in the second step using the search term given in each row.

This query refreshes with no problems in Power BI Desktop. However, when you publish a report that uses this code to PowerBI.com and try to refresh the dataset, you’ll see that refresh fails and returns a rather unhelpful error message:

Data source error Unable to refresh the model (id=1264553) because it references an unsupported data source.

image
image

The problem is that when a published dataset is refreshed, Power BI does some static analysis on the code to determine what the data sources for the dataset are and whether the supplied credentials are correct. Unfortunately in some cases, such as when the definition of a data source depends on the parameters from a custom M function, that static analysis fails and therefore the dataset does not refresh.

The good news is that when, as in this case, the data source is a call to Web.Contents() then Power BI only checks the base url passed into the first parameter during static analysis – and as my previous blog post shows, by using the RelativePath and Query options with Web.Contents() you can leave the value passed to the first parameter as a static string. Therefore, the following version of the query does refresh successfully in Power BI:

let
    Terms =
      #table(
       {"Term"},
       {{"apples"}, {"oranges"}, {"pears"}}
       ),
    SearchSuccessful = (Term) =>
    let
        Source =
         Json.Document(
           Web.Contents(
             "https://data.gov.uk/api/3/action/package_search",
             [Query=[q=Term]]
            )
           ),
        Success = Source[success]
    in
        Success,
    Output =
     Table.AddColumn(
       Terms,
       "Search Successful",
       each SearchSuccessful([Term])
      )
in
    Output

This technique will only work if the url passed to the first parameter of Web.Contents() is valid in itself, is accessible and does not return an error. But what if it isn’t? Luckily there’s another trick you can play: when you specify the Query option it can override parts of the url supplied in the first parameter. For example, take the following expression:

Web.Contents(
 "https://data.gov.uk/api/3/action/package_search?q=apples",
 [Query=[q="oranges"]]
)

When static analysis is carried out before dataset refresh, the url

https://data.gov.uk/api/3/action/package_search?q=apples

..is evaluated. However when the dataset is actually refreshed, the search term in the Query option overrides the search term in the base url, so that the call to the web service that is actually made and whose data is used by the query is:

https://data.gov.uk/api/3/action/package_search?q=oranges

This means you can specify a base url that isn’t really just a base url just so that static analysis succeeds, and then use the Query option to construct the url you really want to use.

Of course this is all a bit of a hack and I’m sure, eventually, we’ll get to the point where any M code that works in Power BI Desktop and/or Power Query works in a published report. However it doesn’t sound as though this will be happening in the near future so it’s good to know how to work around this problem. I wonder whether there are other, similar tricks you can play with functions that access data sources apart from Web.Contents()? I need to do some testing…

UPDATE: The “Skip Test Connection” option on Power BI data sources, added in April 2019, solves some of the problems you run into where you can’t use RelativePath or Query to construct the url. See https://blog.crossjoin.co.uk/2019/04/25/skip-test-connection-power-bi-refresh-failures/

UPDATE #2: since writing this post the error message you’ll get in the Power BI Service has changed to this: You can’t schedule refresh for this dataset because the following data sources currently don’t support refresh The problem and the solution are exactly the same though.

Using The RelativePath And Query Options With Web.Contents() In Power Query And Power BI M Code

The Web.Contents() function in M is the key to getting data from web pages and web services, and has a number of useful – but badly documented – options that make it easier to construct urls for your web service calls.

Consider the following url:

https://data.gov.uk/api/3/action/package_search?q=cows

It is a call to the metadata api (documentation here) for https://data.gov.uk/, the UK government’s open data portal, and returns a JSON document listing all the datasets found for a search on the keyword “cows”. You can make this call using Web.Contents() quite easily like so:

Web.Contents(
 "https://data.gov.uk/api/3/action/package_search?q=cows"
)

However, instead of having one long string for your url (which will probably need to be constructed in a separate step) you can use the RelativePath and Query options with Web.Contents(). They are given in the second parameter of the function and passed through as fields in a record. RelativePath adds some extra text to the base url given in the first parameter for the function, while Query allows you to add query parameters to the url, and is itself a record.

So, taking the example above, if the base url for the api is https://data.gov.uk/api we can use these options like so:

Web.Contents(
 "https://data.gov.uk/api",
 [
  RelativePath="3/action/package_search",
  Query=[q="cows"]
 ]
)

RelativePath is just the string “3/action/package_search” and is added to the base url. There is just one query parameter “q”, the search query, and the search term is “cows”, so Query takes a record with one field: [q=”cows”]. If you want to specify multiple query parameters you just need to add more fields to the Query record; for example:

Web.Contents(
	"https://data.gov.uk/api",
	[
		RelativePath="3/action/package_search",
		Query=
		[
			q="cows",
			rows="20"
		]
	]
)

Generates a call that returns 20 results, rather than the default 10:

https://data.gov.uk/api/3/action/package_search?q=cows&rows=20

Obviously these options make it easier to construct urls and the code is much clearer, but there are also other benefits to using these options which I’ll cover in another blog post soon.

Note: at the time of writing there is a bug that causes the value given in RelativePath to be appended twice when the Web.Page() function is also used. Hopefully this will be fixed soon.

UPDATE January 2021: This blog post has more details on using the Query option: 
https://blog.crossjoin.co.uk/2021/01/10/handling-multiple-url-query-parameters-with-the-same-name-using-web-contents-in-power-query-power-bi/

Handling 404–Not Found Errors With Web.Contents() In Power Query And Power BI

One strange feature of the Web.Contents() function in Power Query and Power BI is that it doesn’t respond in a consistent way to the standard error handling techniques used in M. I don’t know if this is a bug or a feature, but it’s certainly something I’ve run into a few times so I thought I would share a description of the problem and a way of working around it.

First of all, what’s the problem? Imagine that you wanted to import a list of training courses from that fine UK Microsoft BI and SQL Server training company Technitrain into Power Query or Power BI. You could do so using an M query that uses the Web.Contents() function to get the course RSS feed, like so:

let
Source = Web.Contents("http://technitrain.com/feed/")
in
Source

But what happens if you get the URL wrong, or there’s some other problem with the site? For example, the following URL will give a 404 – Not Found error because the page doesn’t exist:

http://technitrain.com/blahblah

If you use it in an M query, like so:

let
Source = Web.Contents("http://technitrain.com/blahblah")
in
Source

Unsurprisingly you get an error:

DataSource.Error: Web.Contents failed to get contents from ‘http://technitrain.com/blahblah’ (404): Not Found

image

The real issue is, though, when you attempt to handle this error with a try/otherwise statement like so:

let
Source = try
Web.Contents("http://technitrain.com/blahblah")
otherwise
"Error!"
in
Source

…it doesn’t work and you get the same error! What’s strange is that in some cases a try/otherwise block in more complex code will work, so for example in:

let
Source = try
Xml.Tables(
Web.Contents("http://technitrain.com/blahblah")
)
otherwise
"Error!"
in
Source

… the error does get caught:

image

This thread on the Power Query forum suggests it’s something to do with lazy evaluation, but I haven’t been able to determine the situations when it does work and when it doesn’t.

Instead, it is possible to handle specific HTTP error codes using the ManualStatusHandling option in Web.Contents():

let
Source = Web.Contents(
"http://technitrain.com/blahblah",
[ManualStatusHandling={404}])
in
Source

The ManualStatusHandling option takes a list of numeric HTTP error codes, and if you run the above example you’ll see that the query no longer returns an error.

The next problem is, then, how do you know whether the request worked or not? It turns out that you can find out by looking at the metadata associated with the Source variable (for some background on getting metadata values see this post). So, for example, using Value.Metadata() on the Source variable like so:

let
Source = Web.Contents(
"http://technitrain.com/blahblah",
[ManualStatusHandling={404}]),
GetMetadata = Value.Metadata(Source)
in
GetMetadata

Returns a record which, among other things, contains the HTTP response code:

image

Therefore you can use something like the following pattern to trap 404 errors:

let
Source = Web.Contents(
"http://technitrain.com/blahblah",
[ManualStatusHandling={404}]),
GetMetadata = Value.Metadata(Source),
GetResponseStatus = GetMetadata[Response.Status],
Output = if GetResponseStatus=404 then "Error!" else Source
in
Output

Another Way To Check Query Folding In Power BI/Power Query M Code

Following on from my last post on checking whether query folding is taking place in a Power BI/Power Query query, I’ve just stumbled on another way of doing this – by looking at metadata. I blogged about M metadata and Power Query some time ago (you might want to read the post to get some background), but it seems like metadata is used more widely than I thought in M…

Take the following query, as used in my previous blog post:

[sourcecode language=”text”]
let
Source = Sql.Database("localhost", "adventure works dw"),
dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
#"Removed Other Columns" =
Table.SelectColumns(
dbo_DimDate,
{"DateKey", "FullDateAlternateKey",
"DayNumberOfWeek"}),
#"Filtered Rows" =
Table.SelectRows(
#"Removed Other Columns",
each ([DayNumberOfWeek] = 1))
in
#"Filtered Rows"
[/sourcecode]

As shown in that post this query connects to the Adventure Works DW database in SQL Server and gets a few columns plus some filtered rows from the DimDate table, and most importantly query folding takes place for all of these transformations.

It turns out that each of the variables used as steps in the query can have metadata automatically assigned to them, some of which is relevant to query folding, and we can use the Value.Metadata() function to get this metadata. So, taking the previous query, we can add two extra steps:

[sourcecode language=”text” highlight=”13,14″ padlinenumbers=”true”]
let
Source = Sql.Database("localhost", "adventure works dw"),
dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
#"Removed Other Columns" =
Table.SelectColumns(
dbo_DimDate,
{"DateKey", "FullDateAlternateKey",
"DayNumberOfWeek"}),
#"Filtered Rows" =
Table.SelectRows(
#"Removed Other Columns",
each ([DayNumberOfWeek] = 1)),
GetMetadata = Value.Metadata(#"Filtered Rows"),
QueryFolding = GetMetadata[QueryFolding]
in
QueryFolding
[/sourcecode]

The GetMetadata step gets the metadata associated with the #”Filtered Rows” step, and that metadata comes back in the form of a record with a single field called QueryFolding. The next step gets the data from that field, which is again in the form of a record:

image

So we can see quite clearly that this step is being folded and that SQL is being generated in the background. It doesn’t seem to be possible to get the SQL query generated here though.

One last thing to note: while all of this works for SQL Server, when I tested it on an SSAS data source where query folding is also definitely taking place the metadata told me – incorrectly – that there was no query folding happening (I couldn’t see the MDX query generated using the View Native Query menu option either). Maybe this is all still a work in progress?