Dynamic What-If With User-Entered Data In Power BI

My recent post on passing any value to a Power BI dynamic M parameter from a report generated quite a lot of interest but only among people who use DirectQuery mode and who want more flexibility when filtering. In this post, however, I’ll show you an application for this technique that is useful for anyone, regardless of whether they are using DirectQuery mode or Import mode, and which is nothing to do with filtering.

You’re probably familiar with Power BI’s what-if parameter feature which allows end users to pass values into calculations. It’s great but it has one big limitation: all the values that can be used need to be hard-coded in a calculated table at design time. What if (haha) you could let your users enter the values they wanted to use in the report instead? Let’s see a simple example of how to do this.

Here’s an Import mode fact table containing sales values for various products:

The aim is to allow end users to enter one or more percentage values in a report and have these used to calculate growth forecasts on the Sales Amount value.

To do this you need to use the technique from the blog post referenced above to let the users enter the percentage values. First, create an M parameter of type Decimal Number (called EnteredValueParam) in the Power Query Editor:

Then create a dummy query (in this case called EnteredValues) with no rows and a single numeric column to bind the parameter to. Here’s the M code for this:

#table(
  type table [EnteredValues = number],
  {}
)

Now comes the fun part. Rather than use these entered values to filter a SQL query, they will be used to generate a SQL statement that returns the same values in the form of a disconnected, DirectQuery table connected to a relational database (SQL Server in this case). Here’s the M code for a query, in this example called ReturnedValues, that does this:

let
  //Check if the M parameter returns a list                                         
  //If it doesn't, turn it into a list                                    
  FilterList =
    if Type.Is(
      Value.Type(EnteredValueParam),
      List.Type
    )
    then
      EnteredValueParam
    else
      {EnteredValueParam},
  //Generate all the individual SELECT statements                                               
  GenerateSQLQueries = List.Transform(
    FilterList,
    each "SELECT '"
      & Text.From(_)
      & "%' AS Forecast, "
      & Text.From(_)
      & " AS Pct"
  ),
  //Concatenate all the SELECT statements with UNION ALL into a single statement                                                                              
  UnionAll = Text.Combine(
    GenerateSQLQueries,
    " UNION ALL "
  ),
  //Connect to the SQL Server database and run the query                                                      
  Source = Sql.Database(
    "localhost",
    "AdventureWorksDW2017"
  ),
  RunQuery = Value.NativeQuery(
    Source,
    UnionAll
  )
in
  RunQuery
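The GenerateSQLQueries and UnionAll steps are plain string manipulation; if it helps to see the logic outside M, here's the same thing sketched in Python:

```python
def build_union_all(values):
    """Build a UNION ALL SELECT statement from a list of numeric values,
    mirroring the GenerateSQLQueries and UnionAll steps in the M query."""
    selects = [
        f"SELECT '{v}%' AS Forecast, {v} AS Pct"
        for v in values
    ]
    return " UNION ALL ".join(selects)
```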

If the parameter EnteredValueParam is passed the single value 1, the ReturnedValues query generates the following SQL query and runs it:

SELECT '1%' AS Forecast, 1 AS Pct

Here’s what the ReturnedValues query returns:

If the parameter EnteredValueParam returns the list {1,25,50,75}, the ReturnedValues query generates the following SQL query and runs it:

SELECT '1%' AS Forecast, 1 AS Pct 
UNION ALL 
SELECT '25%' AS Forecast, 25 AS Pct 
UNION ALL 
SELECT '50%' AS Forecast, 50 AS Pct 
UNION ALL 
SELECT '75%' AS Forecast, 75 AS Pct

Here’s what the ReturnedValues query returns in this case:

At this point, with all the queries loaded, the dataset looks like this:

The ReturnedValues table is in DirectQuery mode; the Sales table is in Import mode; the EnteredValues table is also in Import mode and the EnteredValues column on that table is bound to the EnteredValueParam M parameter, which has the Multi-select property turned on.

With all this in place you can create a measure on the Sales table which takes the sum of Sales Amount and applies a percentage growth based on the selected value in the Pct column of the ReturnedValues table:

Sales Amount Forecasts =
SUM ( Sales[SalesAmount] )
    * (
        ( SELECTEDVALUE ( ReturnedValues[Pct] ) / 100 ) + 1
    )
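The arithmetic here is simple percentage growth: a selected Pct of 25 turns a sales amount of 100 into 125. As a quick sanity check outside DAX, here's the same calculation in Python (a sketch of the formula, not the measure itself):

```python
def forecast(sales_amount, pct):
    """Apply a percentage growth to a sales amount, the same arithmetic
    as the Sales Amount Forecasts measure: sales * ((pct / 100) + 1)."""
    return sales_amount * ((pct / 100) + 1)
```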

Finally, here’s a report page that uses the Filter By List custom visual to allow end users to enter the percentage forecast values and has a matrix to display the measure above with the Product field from the Sales table on rows and the Forecast field from the ReturnedValues table on columns:

Here it is in action:

It’s a bit of a pain that you need a DirectQuery data source in order to be able to do this but it’s not too difficult to set one up and it won’t need much in the way of resources. I think this technique (which I hereby name “Dynamic DirectQuery Disconnected Dimension Tables”) could be useful in other scenarios too, such as ABC analysis. If you can think of other ways to use this let me know by leaving a comment below!

Avoiding The “Maximum Allowed Size” Error In Power BI DirectQuery Mode With Aggregations On Degenerate Dimensions

Back in December I wrote a post explaining why you may encounter the “The resultset of a query to external data source has exceeded the maximum allowed size” error when working with degenerate dimensions in DirectQuery mode in Power BI. In that post I showed an example of how a moderately complex DAX expression in a measure can trigger the error even when you’re applying a filter in a visual; in this post I’ll show how you can use aggregations to avoid that error.

A quick recap: say you have a dataset built from the ContosoRetailDW sample database with a Date dimension table and a fact table called FactOnlineSales with more than 12 million rows in it.

There are two measures:

Sales Amount = SUM(FactOnlineSales[SalesAmount])

Error Demo =
VAR s = [Sales Amount]
RETURN IF ( s > 0, s )

If you build a table visual with the SalesOrderNumber column (a degenerate dimension from the fact table with the same granularity as the fact table) on rows and just the [Sales Amount] measure in, and you filter to a single date, then everything works ok:

…but if you add the [Error Demo] measure too, a SQL query is generated to get all the values from the SalesOrderNumber column; this returns more than a million rows and triggers the error:

For more details please see the original post.

Since the error is caused by a DirectQuery query to get all the values from the SalesOrderNumber column, one workaround is to build an Import mode aggregation table that contains all the distinct values from that column.

It was easy to do this in my case just using Power Query – I just duplicated the FactOnlineSales query, removed all columns apart from SalesOrderNumber, and did a “Remove Duplicates” transform. This left me with a new table containing just one column, SalesOrderNumber, that I could load into my dataset using Import mode:
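The distinct-values transform is easy to picture outside Power Query too; here's a minimal Python sketch of the same select-one-column-and-remove-duplicates logic (the row structure here is hypothetical, purely for illustration):

```python
def distinct_order_numbers(fact_rows):
    """Return the distinct SalesOrderNumber values from fact table rows,
    preserving first-seen order: the equivalent of keeping one column
    and applying a Remove Duplicates transform in Power Query."""
    seen = set()
    result = []
    for row in fact_rows:
        son = row["SalesOrderNumber"]
        if son not in seen:
            seen.add(son)
            result.append(son)
    return result
```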

I then configured this new table as an aggregation table for the FactOnlineSales table, with the SalesOrderNumber column from the new table set as a GroupBy on the SalesOrderNumber column from FactOnlineSales:

With this aggregation table configured, the table with both measures in no longer gives an error:

A Profiler trace shows that the query to get the data needed for the table is still in DirectQuery mode, but the “dimension query” to get all the distinct values from SalesOrderNumber now hits the Import mode aggregation:

I know what you’re going to say though: “If I’m using DirectQuery for my fact table I don’t want to build Import mode aggregations!”. Well yes, there are some limitations to point out with this approach. In my opinion it will work well if you are using DirectQuery mode because you have very large fact tables – even a few billion rows – but your data doesn’t change very frequently (say only once a day). In that scenario refreshing an aggregation table containing just a single column could be very fast and take up a relatively small amount of memory in Power BI, at least in comparison with an Import mode table containing all the columns from the fact table. Using incremental refresh on the aggregation table will also help but unfortunately you can’t use a hybrid table as an aggregation table at the time of writing this post, so you can’t mix Import mode and DirectQuery for the aggregation table. On the other hand if you’re using DirectQuery because your data changes frequently during the day then I don’t think this approach will work because it will be impossible to keep the contents of your Import mode aggregation table in sync with the contents of your DirectQuery fact table.

The “Block length does not match with its complement” Error In Power Query In Power BI And Excel

When using the Web.Contents function to call a web service in Power Query in either Power BI or Excel you may encounter the following error:

DataFormat.Error: Block length does not match with its complement.

Here’s an example of an M query that calls a web service and, at the time of writing, gives this error:

Web.Contents(
  "https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&hourly=temperature_2m"
)

The problem is caused by something to do with how some web services handle “deflate” compression; apparently the same issue exists in the equivalent native .NET functionality. There are two ways to avoid it though.
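As a rough illustration of this class of problem (and not the exact Power Query internals), here's how a framing mismatch in deflate-compressed data shows up with Python's zlib: data compressed with one framing fails to decompress when read with another:

```python
import zlib

def try_raw_inflate():
    """Compress with the standard zlib framing (header + checksum), then
    try to decompress it as raw deflate (wbits=-15). The framings don't
    match, so decompression fails, much like a client and server that
    disagree about how "deflate" content is wrapped."""
    compressed = zlib.compress(b"some response body")
    try:
        zlib.decompress(compressed, wbits=-15)
        return "ok"
    except zlib.error:
        return "framing mismatch"
```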

First of all, the easy way: ask the web service to use “gzip” compression instead. You can do this by setting the “Accept-Encoding” header to “gzip” like so:

Web.Contents(
  "https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&hourly=temperature_2m",
  [
    Headers = [
      #"Accept-Encoding" = "gzip"
    ]
  ]
)

Having done this the error will no longer occur and you can go on to handle the data returned by the web service (which is a JSON document) as normal.

The second is to use the Web.BrowserContents function instead of Web.Contents:

Web.BrowserContents(
  "https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&hourly=temperature_2m"
)

This is more complicated because the Web.BrowserContents function returns the HTML for a web page as viewed by a web browser. Apart from issues like handling authentication you’ll need to parse the result to get the data you need:

It’s doable using the Html.Table function though. Here’s an example of how to handle the response for this particular web service call:

let
  Source = Web.BrowserContents(
    "https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&hourly=temperature_2m"
  ),
  HTMLTable = Html.Table(
    Source,
    {{"JSONPayload", "pre"}}
  ),
  JSON = Json.Document(
    HTMLTable{0}[JSONPayload]
  )
in
  JSON
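The idea behind the query above (grab the text of the page's pre element and parse it as JSON) can be sketched in a few lines of Python too; this is an illustration of the pattern, not what Power Query does internally:

```python
import json
import re

def json_from_pre(html):
    """Pull the JSON payload out of a <pre> element in an HTML page and
    parse it: the same idea as the Html.Table + Json.Document steps."""
    match = re.search(r"<pre[^>]*>(.*?)</pre>", html, re.DOTALL)
    return json.loads(match.group(1))
```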

[Thanks to Thais Marshall for bringing me this problem, and to Curt Hagenlocher for explaining it and showing me the first workaround]

Passing Any Arbitrary Value From A Power BI Report To A Dynamic M Parameter

Dynamic M parameters are incredibly useful in Power BI when using DirectQuery mode because they allow you to pass values from your report direct to your data source query, bypassing Power BI’s own SQL (or whatever query language your source uses) generation layer. However it isn’t obvious how to pass any value you want: dynamic M parameters must be bound to slicers or filters, and they must be bound to tables in your dataset, which means that at first glance it seems like you can only pass values that are already present somewhere in your dataset (for example in a dimension table) to a dynamic M parameter. This isn’t true though: there is a way to allow passing of any value your users enter to a dynamic M parameter and in this post I’ll show you how.

First of all, why is this useful? There are two reasons why you might want to allow passing of any value to a dynamic M parameter rather than binding to a slicer or filter in the normal way:

  1. Displaying the list of possible values in a slicer or filter can be expensive, especially if the column you’re slicing on contains thousands or millions of values. It can slow down your report and lead to extra queries being run on your DirectQuery source, which can lead to performance problems.
  2. Sometimes the values you want to pass in don’t exist in your DirectQuery source. The scenarios where this is useful are very similar to the scenarios where you’d want to use what-if parameters, but the big limitation of what-if parameters is that you have to pre-calculate all the values that a user might ever want to select and store them in a table. What happens if you don’t want to, or can’t, pre-calculate all the input values?

Let’s see an example of how you can pass any value you want to a dynamic M parameter.

The AdventureWorksDW2017 sample SQL Server database has a table called FactInternetSales with a column called SalesOrderNumber, and let’s say you want to build a Power BI report where an end user can enter whatever SalesOrderNumbers they want and filter the table by them. Let’s also assume that we have to use dynamic M parameters to do this efficiently (which isn’t the case with AdventureWorksDW2017 but could well be in the real world). Here’s what the data looks like:

To set things up in the Power Query Editor you need an M parameter to hold the values entered by the end user. In this example the parameter is called SalesOrderNumber:

Note that the Current Value property is set to a value that is not a valid sales order number, so when no input has been received from the user no rows will be returned by the report.

Next you need a query that filters the FactInternetSales table by this parameter. Here’s the M code:

let
  Source = Sql.Database(
    "localhost",
    "AdventureWorksDW2017"
  ),
  dbo_FactInternetSales = Source
    {
      [
        Schema = "dbo",
        Item   = "FactInternetSales"
      ]
    }
    [Data],
  #"Removed Other Columns"
    = Table.SelectColumns(
    dbo_FactInternetSales,
    {
      "OrderDateKey",
      "SalesOrderNumber",
      "SalesOrderLineNumber",
      "SalesAmount"
    }
  ),
  FilterList =
    if Type.Is(
      Value.Type(SalesOrderNumber),
      List.Type
    )
    then
      SalesOrderNumber
    else
      {SalesOrderNumber},
  #"Filtered Rows" = Table.SelectRows(
    #"Removed Other Columns",
    each List.Contains(
      FilterList,
      [SalesOrderNumber]
    )
  )
in
  #"Filtered Rows"

There are two interesting things to notice here:

  • This code handles the cases where a user enters a single value, in which case the SalesOrderNumber M parameter will be of type text, or when the user enters multiple values in which case the SalesOrderNumber M parameter will be of type list. For more details on handling multi-select in dynamic M parameters see this post.
  • The actual filtering is done using the List.Contains M function, which does fold on SQL Server-related data sources. If you’re using other sources you should check if query folding happens for List.Contains for your source.
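The scalar-or-list normalization in the FilterList step is a pattern worth remembering whenever a parameter can be single- or multi-select; here's the same logic sketched in Python (the row structure is hypothetical, purely illustrative):

```python
def filter_rows(rows, param):
    """Normalize a scalar-or-list parameter to a list, then keep only
    rows whose SalesOrderNumber is in it, mirroring the FilterList and
    Filtered Rows steps of the M query."""
    filter_list = param if isinstance(param, list) else [param]
    return [r for r in rows if r["SalesOrderNumber"] in filter_list]
```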

Thirdly, you need a dummy dimension table with a single column for the dynamic M parameter to be bound to in the report. The dimension table shouldn’t contain any data; here’s the M code to use (the query is called DimSalesOrderNumber):

let
  Source = #table(
    type table [SalesOrderNumber = text],
    {}
  )
in
  Source

This query returns a table with a single text column called SalesOrderNumber and no rows:

Once you’ve left the Power Query Editor the next thing to do is to bind the SalesOrderNumber M parameter to the SalesOrderNumber column of the DimSalesOrderNumber table:

Note that the Multi-select property has been enabled. Binding the dynamic M parameter to a table with no rows in means there’s no way a regular slicer could be used with it, because there are no values in the table for the slicer to display.

The last problem to solve is the important one: how do you allow end users to enter any value they want? There are two ways I know of. One is to use the filter pane and the “is” filter type under “Advanced filter”:

The filter pane is great because it’s built-in but it only allows the user to enter one or two (if they use the “Or” option) values to filter on. Remember also that not all filter types are available when you’re using dynamic M parameters.

A better approach, if you’re ok with using custom visuals, is to use the Filter By List custom visual which allows the end user to enter – or even copy/paste in – a list of values:

Top 10 Reasons Why Power BI People Should Attend SQLBits

SQLBits is happening again this March at the Celtic Manor resort in Wales. It’s the largest in-person Microsoft data technology conference in Europe and probably now the world; it’s also a lot of fun and something I would never miss. I imagine that a lot of you reading this post feel the same way that I do about SQLBits, but it occurred to me that if you’re relatively new in the world of Power BI you might not have heard of it and might not know why it’s such a great event to attend. Here are ten reasons why you, as a Power BI person, should go:

  1. Don’t be put off by the fact it has “SQL” in the name, this conference has lots of Power BI content. Take a look at the list of training days and conference sessions, and check out the (free!) recordings of sessions from previous events here.
  2. You could spend the whole event just going to Power BI sessions but there are also tons of sessions on related technologies like SQL Server, Synapse and Azure Databricks; there are also sessions on non-technical topics such as public speaking and how to make it as an independent consultant.
  3. Pretty much everyone who is everyone in the world of Power BI will be there. Want to see Alberto Ferrari, or Adam and Patrick from Guy in a Cube, in the flesh? Then come!
  4. It’s ridiculously cheap because it’s a community-run event and no-one is making a profit. In fact even if you live in the USA it may be cheaper to get a flight to the UK and come than attend a similar conference nearer home. People come from all over Europe (especially Scandinavia) to attend.
  5. No training budget? It’s 100% free to attend on Saturday 18th March – all you have to do is register and you get a full day of conference sessions without paying a penny.
  6. Can’t travel? Even if you can’t make it in person there’s a virtual option so you can watch sessions online. You really have to be there in person to get the full experience though.
  7. Apart from the learning opportunities, it’s a great place to network. I know lots of people who have found new jobs and advanced their careers from connections made at SQLBits – me included.
  8. Are you the only Power BI person at your organisation? Feeling lonely? Not sure that you’re doing things properly? You’ll meet people who do the same job as you so you can compare notes and complain about your users.
  9. It’s also a great place to meet companies that work in the Power BI space like Tabular Editor and Power BI Sentinel and check out their products and services.
  10. Most of all you’ll enjoy yourself. I know that might be hard to believe, especially if you’ve been to other tech conferences, but SQLBits is different: there’s entertainment every night, the Friday party is legendary and the infamous Power BI Power Hour is the most fun you’ll ever have with your favourite BI tool. It’s also a very friendly, welcoming and inclusive event.

Hopefully this has convinced you! If you’d like to find out more about SQLBits and how it’s run, I can recommend this recent interview with Simon Sabin on my colleague Kasper de Jonge’s YouTube channel.

Understanding The “A cyclic reference was encountered during evaluation” Error In Power Query In Power BI Or Excel

When working in the Power Query Editor in Power BI or Excel, especially when you’re editing M code, you may run into the following error:

Expression.Error: A cyclic reference was encountered during evaluation.

What does this message mean and what causes this error? In this blog post I’ll explain.

The first step to understanding what’s going on here is to understand how Power Query evaluates the steps in a query. I wrote a blog post on let expressions in M some time ago, which you can read here and which goes into a lot of detail on this subject, but the basics are quite easy to understand. Consider the following M query:

let
    Step1 = 1,
    Step2 = Step1 + 5,
    Step3 = Step2*2
in
    Step3

If you paste this into a new blank query in the Power Query Editor you’ll see it results in three steps in the Applied Steps on the right hand side of the screen:

The query itself, as you would expect, returns the value 12:

The important thing is to know that Power Query does not evaluate the steps in the order that they appear here – so it does not evaluate Step1 first, then Step2 and then Step3. What it does is evaluate Step3 first, which is needed for the output of the query, and then because it needs the value of Step2 to do this it evaluates Step2 next, and then because it needs the value of Step1 to evaluate Step2 it evaluates Step1 last. You can visualise the chain of dependencies between these steps like so:

Now consider the following M query:

let
    Step1 = Step3,
    Step2 = Step1 + 5,
    Step3 = Step2*2
in
    Step3

The change is that Step1 is now set to the value of Step3 instead of the value 1. This query gives you the “A cyclic reference was encountered during evaluation” error and you can probably see why: the query returns the value of Step3, which needs the value of Step2, which needs the value of Step1, which needs the value of Step3 – so there’s no way of calculating the output of the query. There’s a loop in the dependencies between the steps:

This is the cyclic reference (sometimes known as a “circular reference”) from the error message. As you can imagine it’s very easy to create cyclic references when you’re editing M code manually and changing step names.
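If you want to see this demand-driven evaluation, and the cycle detection, in executable form, here's a rough Python sketch (my own analogy, not how the Power Query engine is actually implemented):

```python
def evaluate(steps, output):
    """Demand-driven evaluation of named steps, roughly how Power Query
    resolves a let expression. `steps` maps a step name to a function
    that receives a getter for other steps; a step that is re-entered
    while it is still being evaluated is a cyclic reference."""
    in_progress, cache = set(), {}

    def get(name):
        if name in cache:
            return cache[name]
        if name in in_progress:
            raise RuntimeError(
                "A cyclic reference was encountered during evaluation."
            )
        in_progress.add(name)
        cache[name] = steps[name](get)
        in_progress.discard(name)
        return cache[name]

    return get(output)

# The first query above: Step1 = 1, Step2 = Step1 + 5, Step3 = Step2 * 2
good = {
    "Step1": lambda get: 1,
    "Step2": lambda get: get("Step1") + 5,
    "Step3": lambda get: get("Step2") * 2,
}
```

Evaluating `good` from Step3 returns 12; changing Step1 to ask for Step3, as in the second query, makes the evaluator raise the cyclic reference error.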

It’s also possible to run into this error when using referenced queries, that’s to say when one Power Query query uses the output of another Power Query query as its data source. For example, let’s say you had a query called Query1 with the following M code:

let
    Source = Query2
in
    Source

…and a query called Query2 with the following M code:

let
    Source = Query1
in
    Source

Once again there is a cyclic reference: Query1 refers to the output of Query2 and Query2 refers to the output of Query1. The error message returned for Query2 is:

An error occurred in the ‘Query1’ query. Expression.Error: A cyclic reference was encountered during evaluation.

An extra problem in this case is that if you try to delete just one of these queries, you’ll get the following error message:

The query “Query1” cannot be deleted because it’s being referenced by another query: “Query2”.

You have to select both queries and then delete them, or of course edit one of the queries to break the cyclic reference.

One last thing to mention: recursion is possible in M, meaning that you can write a custom function that calls itself (Miguel Escobar has a detailed post on this here) but it can be very hard to get it working and it’s not always the most efficient way to solve a problem.

Using OData.Feed And The Query Option To Avoid The Dynamic Data Sources Error In Power BI

In my last post I showed how, in many cases, you can avoid the “dynamic data sources” error with OData data sources by taking advantage of query folding. That’s not always possible though and in this post I’ll show you how you can use the Query option of the OData.Feed function to do so instead.

At first glance the Query option of OData.Feed looks very much like the Query option on Web.Contents which, of course, can also be used to avoid the dynamic data sources error (see here and here) and that’s true up to a point: you can use it to add query parameters to your OData URL. However the documentation is not particularly detailed and there is one thing that will confuse you when you try to use it: you can’t use it with OData system query options like $filter. For example, let’s say you wanted to query the People entity in the TripPin sample OData endpoint to get only the people whose first name was Scott. You can do this as follows:

let
  Source = OData.Feed(
    "https://services.odata.org/TripPinRESTierService/People?$filter=FirstName eq 'Scott'",
    null,
    [Implementation = "2.0"]
  )
in
  Source

Since $filter is a URL query parameter, you might then try the following code:

let
  Source = OData.Feed(
    "https://services.odata.org/TripPinRESTierService/People",
    null,
    [
      Implementation = "2.0",
      Query = [
        #"$filter"
          = "FirstName eq 'Scott'"
      ]
    ]
  )
in
  Source

Unfortunately, this returns the following error:

Expression.Error: OData.Feed custom query options cannot start with ‘$’.

This is Power Query’s way of telling you you can’t use OData system query options with the OData.Feed Query option, since they always start with a $ sign. This is a deliberate design decision on the part of the Power Query team; I won’t go into the reasons why it is this way, it’s complicated!

However, you can use the Query option with OData custom query options like so:

let
  Source = OData.Feed(
    "https://services.odata.org/TripPinRESTierService/People",
    null,
    [
      Implementation = "2.0",
      Query = [#"debug-mode" = "true"]
    ]
  )
in
  Source

More importantly, you can use the Query option with OData parameter aliases. The original URL we were looking at:

https://services.odata.org/TripPinRESTierService/People?$filter=FirstName eq 'Scott'

…can be rewritten to use parameter aliases like so:

https://services.odata.org/TripPinRESTierService/People?$filter=FirstName eq @fn&@fn='Scott'

…and the rewritten version can be called using the Query option like so:

let
  Source = OData.Feed(
    "https://services.odata.org/TripPinRESTierService/People?$filter=FirstName eq @fn", 
    null, 
    [
      Implementation = "2.0", 
      Query = [#"@fn" = "'Scott'"]
    ]
  )
in
  Source
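Conceptually the Query option just appends its fields to the URL as extra query-string parameters, so the call above ends up requesting the rewritten parameter-alias URL. Here's a Python sketch of that combination (an assumption about the effective URL, not the engine's actual code):

```python
from urllib.parse import urlencode

def combine(base_url, query):
    """Append the fields of a Query-option-style record to a URL as
    extra query-string parameters (percent-encoded), which is roughly
    how the OData.Feed Query option behaves."""
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(query)
```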

You can use parameter aliases to call OData unbound functions too. For example, you can call TripPin’s GetNearestAirport function like so:

let
  Source = OData.Feed(
    "https://services.odata.org/TripPinRESTierService/GetNearestAirport(lat = @latitude, lon = @longitude)", 
    null, 
    [
      Implementation = "2.0", 
      Query = [
        #"@latitude"  = "100", 
        #"@longitude" = "0"
      ]
    ]
  ), 
  #"Converted to Table" = Record.ToTable(
    Source
  )
in
  #"Converted to Table"

Finally, here’s an example of a query that starts with a table of latitudes and longitudes and calls the GetNearestAirport function for each row:

let
  Source = #table(
    type table [
      LocationLatitude = text,
      LocationLongitude = text
    ],
    {
      {"10", "10"},
      {"33", "-118"},
      {"100", "-100"}
    }
  ),
  #"Added Custom" = Table.AddColumn(
    Source,
    "FunctionCall",
    each OData.Feed(
      "https://services.odata.org/TripPinRESTierService/GetNearestAirport(lat = @latitude, lon = @longitude)",
      null,
      [
        Implementation = "2.0",
        Query = [
          #"@latitude" = [
            LocationLatitude
          ],
          #"@longitude" = [
            LocationLongitude
          ]
        ]
      ]
    )
  ),
  #"Expanded FunctionCall"
    = Table.ExpandRecordColumn(
    #"Added Custom",
    "FunctionCall",
    {"Name", "IcaoCode", "IataCode"},
    {
      "FunctionCall.Name",
      "FunctionCall.IcaoCode",
      "FunctionCall.IataCode"
    }
  )
in
  #"Expanded FunctionCall"

When you first publish a dataset containing this query to the Power BI Service you’ll see an internal server error message coming from the TripPin OData service. You can still make the dataset refresh successfully, and avoid both this and the dynamic data sources error: check the Skip Test Connection box in the credentials dialog (opened from the Settings pane) and set an appropriate data privacy level on the data source that is sending data to the OData function:

OData.Feed And The Dynamic Data Sources Error In Power BI

I’ve blogged about the “dynamic data sources” error and the Web.Contents function several times (most recently here), but never about the fact that you can encounter the same error when working with OData data sources and the OData.Feed function. More importantly, while it seems like the solution to the problem is the same for both functions, this is not the case! In fact, the solution may be simpler than you might think.

Let’s take a simple example. Say you have a table containing three first names and you want to use those names to filter the People entity in the TripPin OData sample service. Knowing a bit about how OData works, you might come up with some M code like this:

let
  Source = #table(
    type table [FirstName = text],
    {{"Scott"}, {"Angel"}, {"Ursula"}}
  ),
  #"Added Custom" = Table.AddColumn(
    Source,
    "Person",
    each OData.Feed(
      "https://services.odata.org/TripPinRESTierService/People?$filter=FirstName eq '"
        & [FirstName]
        & "'",
      null,
      [Implementation = "2.0"]
    )
  ),
  #"Expanded Person"
    = Table.ExpandTableColumn(
    #"Added Custom",
    "Person",
    {
      "UserName",
      "FirstName",
      "LastName"
    },
    {
      "Person.UserName",
      "Person.FirstName",
      "Person.LastName"
    }
  )
in
  #"Expanded Person"

This code uses the $filter system query option to filter the People entity by the three names provided; it’s not the most efficient code but it works fine in Power BI Desktop. Here’s the output:

However if you publish this to the Power BI Service you’ll get the following error in the dataset Settings page:

You can’t schedule refresh for this dataset because the following data sources currently don’t support refresh

Looking at the docs you’ll see that the OData.Feed function has a Query option like the Web.Contents function, and you might assume that the same solution that works for Web.Contents will work for OData.Feed. It can… but it’s not straightforward and I’ll need to write another post to explain why. (Update: that post is here)

Instead, all you need to do is a simple Merge operation between the first names table and the People table to get the desired result:

let
  Source = #table(
    type table [FirstName = text],
    {{"Scott"}, {"Angel"}, {"Ursula"}}
  ),
  #"Merged Queries" = Table.NestedJoin(
    Source,
    {"FirstName"},
    People,
    {"FirstName"},
    "People",
    JoinKind.Inner
  ),
  #"Expanded People"
    = Table.ExpandTableColumn(
    #"Merged Queries",
    "People",
    {
      "UserName",
      "FirstName",
      "LastName"
    },
    {
      "People.UserName",
      "People.FirstName",
      "People.LastName"
    }
  )
in
  #"Expanded People"

One key difference between OData sources and other web services is that query folding is possible with OData sources. Sure enough, if you use Query Diagnostics to see the calls made to the OData API, you’ll see the following call is made:

https://services.odata.org/TripPinRESTierService/(S(gprer1xbnmwlxoeb50woookg))/People?$filter=FirstName eq 'Scott' or FirstName eq 'Angel' or FirstName eq 'Ursula'&$select=UserName%2CFirstName%2CLastName

The Power Query engine has folded the Merge and generated an efficient request using the $filter and $select system query options without you needing to build one yourself.
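The folded $filter predicate is just the three names combined with "or"; here's a quick sketch of building that expression, to show there's no magic involved:

```python
def folded_filter(names):
    """Build the OData $filter expression that query folding produced
    for the Merge over the FirstName values."""
    return " or ".join(f"FirstName eq '{n}'" for n in names)
```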

The dataset will now refresh successfully after you have published it to the Power BI Service. There’s one last important point to mention: you’ll need to make sure the data privacy settings for your data sources, set in the Power BI Service, allow query folding to take place. To learn more about this topic you can watch the video here, but TLDR: if you’re OK with sending data from one data source to another then you can set all data sources to the “Public” privacy level and you should be ok.

Why DAX Window Functions Are Important For Performance In Power BI DirectQuery Mode

The new DAX window functions (announced here, more details on Jeffrey Wang’s blog here and here) have generated a lot of excitement already – they are extremely powerful. However, one important benefit of using them has not been mentioned so far: they can give you much better performance in DirectQuery mode because they make it more likely that aggregations are used. After all, the fastest DirectQuery datasets are the ones that can use aggregations (ideally Import mode aggregations) as much as possible.

To illustrate this, here’s a very simple dataset with a fact table in DirectQuery mode and a Date dimension table in Dual mode built on the SQL Server AdventureWorksDW2017 sample database:

Let’s start off with a simple measure that sums up the values in the SalesAmount column:

Sales Amount = SUM('Internet Sales'[SalesAmount])

When you use it in a table visual with the CalendarYear column from the Date table like so:

…Power BI can get the data it needs with a single SQL query. I won’t show the whole query here, but it’s a simple Group By and returns exactly what you’d expect if you run it in SQL Server Management Studio:

Now let’s say you want to do a year-on-year growth calculation. To do this, you’ll need to be able to find the Sales Amount for the previous year. One way to do this in DAX would be to use the SamePeriodLastYear function like so:

LY Sales Amount V1 =
CALCULATE (
    [Sales Amount],
    SAMEPERIODLASTYEAR ( 'Date'[FullDateAlternateKey] )
)

Used in a table it gives the correct result:

However the SQL query generated for this visual is now a lot more, ahem, verbose and, because the DAX time intelligence functions are all resolved at the date granularity, it now returns Sales Amount summed up by date rather than by year:

If you write the same previous year sales measure using the new Offset function like so:

LY Sales Amount V2 =
CALCULATE (
    [Sales Amount],
    OFFSET (
        -1,
        ALLSELECTED ( 'Date'[CalendarYear] ),
        ORDERBY ( 'Date'[CalendarYear], ASC )
    )
)

…you get the same result as before:

…but now the SQL query is much simpler and returns data at the year granularity, as you’d want:

I’m not a SQL expert so I won’t comment on the SQL generated – besides, it may change in the future – but the most important implication of this is that the version of the measure that uses Offset is more likely to be able to use aggregations.
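Incidentally, once a previous-year measure is in place, finishing the year-on-year growth calculation itself is straightforward. Here’s a sketch (the measure name is my own) that builds on the Offset-based version, using Divide to avoid division-by-zero errors:

```dax
YoY Growth % =
DIVIDE (
    [Sales Amount] - [LY Sales Amount V2],
    [LY Sales Amount V2]
)
```

Because it only references the base [Sales Amount] measure and the Offset-based [LY Sales Amount V2], it should inherit the same aggregation-friendly behaviour described below.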

For example, if you add an extra import-mode table to the dataset with the data from the fact table aggregated to year granularity:

…and set it up as an aggregation table:

…then any queries at the year granularity should use it. As you would expect, the query for the visual shown above with the measure using SamePeriodLastYear misses the aggregation and goes to the DirectQuery fact table:

The query for the measure using Offset, though, can use the aggregation and there is no DirectQuery activity at all:

This is just one example. I’m almost certain there are other ways to write this calculation without using Offset which will also hit the aggregation, but they won’t be as elegant as the Offset version. What’s more, as your measures get more and more complex it gets harder and harder to write DAX that results in simple, efficient SQL in DirectQuery mode, and the new window functions are a massive help here.

Eighteenth Blog Birthday: On The Question Of “Wouldn’t It Be Great If Power BI Did This?”

Every year, on the anniversary of the first-ever post on this blog, I write a post reflecting on what has happened to me professionally in the past year or discussing a non-technical (but still work-related) subject. Two years ago I wrote a post about people who ask the question “Why don’t you add this one simple feature to Power BI?”, something which seems even more relevant now than when I first wrote it. This year I want to discuss a similar topic: people who ask “Wouldn’t it be great if Power BI did this?”, where “this” is some amazing, complicated new piece of functionality they have thought of.

First of all, let me explain why “Wouldn’t it be great if Power BI did this?” is different from “Why don’t you add this one simple feature to Power BI?”. The “one simple feature” question is asked by people who come across the same problem in their day-to-day work so frequently that they can’t understand why Microsoft hasn’t addressed it yet – which is why they are often so frustrated and angry – and where the solution seems obvious and easy. Expanding/collapsing column headers, the top-voted idea on ideas.powerbi.com, is a great example of this type of question: lots of people want it, there are no arguments about how useful it would be, but as Amanda’s comment on the item says, it’s actually a lot harder for us to implement than you would think. Hopefully we’ll be able to do it soon.

The “wouldn’t it be great” question is different because it is prompted by long years of experience of BI tools and projects and is the result of some very creative thinking: the solutions suggested are never obvious, never going to appear on ideas.powerbi.com, and are much more strategic. In the past I’ve been very much the type of person to ask the “wouldn’t it be great” question. For example, about fifteen years ago I remember writing a long email to the Analysis Services team asking for something vaguely like DirectQuery on datasets; more recently I remember trying to convince Chris Finlan that it would be good if paginated reports could render to adaptive cards. Luckily for me the people I sent these emails to were always very polite, even if nothing ever came of my galaxy-brain proposals.

Why does someone like me love asking the “wouldn’t it be great” question so much? It’s the IT equivalent of fantasy football: if you spend way too much time thinking about your favourite football team/BI tool then it’s only natural to imagine how cool it would be to be the coach/program manager of that team/tool and make important decisions about its destiny. Maybe you think you could do a better job than the people who are actually in charge. It’s harmless fun and a great way to while away a few hours with friends over a beer at a conference.

It’s important to understand why the “wouldn’t it be great” question is not a good way of requesting changes to Power BI though. My colleagues Kasper de Jonge and Matthew Roche, both of whom have many years of experience as program managers, helped me with this when I first joined the Power BI CAT team. Now, when I think of a “wouldn’t it be great” idea, I hear Kasper’s voice in my head telling me that it’s the CAT team’s job to collect and curate feedback and it’s the program managers alone whose job it is to design the product. From Matthew I learned the word “solutionize”, which according to the definition I found here means “to come up with a solution for a problem that hasn’t been defined (and might not even exist)” where “common examples are things like developing a feature because you can, iterating and building without research, or selecting a platform or pattern before knowing the required functionality”. Asking the “wouldn’t it be great” question is suggesting a solution when actually the people who design the solutions need to hear the details of the problem you’re trying to solve. If you provide enough details of the problem then it’s possible that the program manager will come to the same conclusion as you about the solution, but even then there are so many other factors to take into account (such as whether it’s even possible to implement your solution) that nothing can be guaranteed.

So, in summary, if you have a problem that you think Power BI needs to solve then you should spend as much time as possible defining that problem rather than imagining what the solution could be – that’s the best way to ensure it gets the attention it deserves from Microsoft. To be clear I’m not saying that you shouldn’t suggest a solution too, I’m just saying that solutions are not what you should be focusing on.