Why I’m Excited About Dynamic M Parameters In Power BI

My favourite feature in the October 2020 release of Power BI is undoubtedly dynamic M parameters. However, before I explain why, I’m going to put my hands up and admit that we screwed up the communication around its launch and didn’t explain its limitations properly – which led to some understandable disappointment and confusion.

Although the blog post and the docs have now been updated to make it clear that this feature doesn’t work with many popular data sources like SQL Server, Synapse, Oracle, Teradata and SAP (see here for the full details), I thought I’d explain why. It’s because there are two sets of connectors that Power BI uses for DirectQuery: an older set that was built some time ago for the data sources originally supported by Analysis Services Tabular, and a newer set built in Power Query’s M language. Dynamic M parameters only work with this newer set of connectors; if you want to see this feature extended so it works with the older connectors too then please vote on ideas.powerbi.com, talk to your Microsoft account team, call Satya or whatever you normally do.

OK, so why am I excited? Well, because dynamic M parameters let you do a whole bunch of cool things that weren’t possible before. My original plan for this blog post was to show how they can be used to change the dimension column used in a visual dynamically, but Mim beat me to it with this blog post. Instead, in this post I’m going to show you how you can use them to create something like dynamic calculated columns, and I also want to make a more general point that will Blow Your Mind (hopefully).

For my example I’m going to use Azure Data Explorer (ADX) as my data source. There are a few reasons why: it’s something I’ve been interested in learning more about for a while, and the ADX team recently announced a bunch of new features that make me think it will be increasingly useful in BI scenarios; it will be important to know if you’re using the new Azure Monitor integration for Power BI log data; and, crucially, it’s one of the few M-based connectors where you can write your own queries in DirectQuery mode (although this is coming to Snowflake and Redshift next year too).

My source data is from the UK Land Registry Price Paid dataset, and for my demo report I loaded a table containing around 790,000 real estate transactions into ADX. Each real estate transaction has the price paid for the property:

One obvious way to analyse this data is to build a histogram and show the number of transactions that occur in price bins such as £0-£100000, £100000-£200000, £200000-£300000 and so on. This is quite easy to do in Power BI, for example using calculated columns in Import mode, unless you want to allow your users to change the bin sizes themselves. Dynamic M parameters give you an easy way to solve this problem, though.

KQL, the query language for ADX, has a really nice way of generating bins for histograms using the summarize operator and the bin() function. For example, the following KQL query does something like a SQL GROUP BY and returns the number of real estate transactions for each combination of county and price bin, with the bins in £25000 increments:

pricepaid
| summarize Transactions=count() by county, bin(price, 25000)

In Power BI I created a parameter called binsizes:
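
Behind the scenes, a Power Query parameter is just an M query with some metadata attached. As a rough sketch (the default value shown here is just an illustration), the M behind the binsizes parameter looks something like this:

// binsizes: a number parameter with an illustrative default of 25000
25000 meta [IsParameterQuery = true, Type = "Number", IsParameterQueryRequired = true]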

…and then created a parameterised version of the KQL query above to populate a DirectQuery table. Here’s the M code for this table:

let
    // Run a parameterised KQL query against the pricepaid database;
    // the value of the binsizes M parameter is spliced into the query text
    Source = 
     AzureDataExplorer.Contents(
      "https://cwebb.northeurope.kusto.windows.net", 
      "pricepaid", 
      "pricepaid | summarize Transactions=count() by county, bin(price, " 
      & Number.ToText(binsizes) & 
      ")", 
      [MaxRows=null, MaxSize=null, NoTruncate=null, AdditionalSetStatements=null]
     )
in
    Source

Next I used a DAX calculated table to create a new table containing the possible bin sizes that the user can choose from:
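
As an illustrative sketch (the table name and the bin size values here are just examples, not the ones from my report), such a calculated table can be defined like this:

Bin Sizes =
// Single-column table of candidate bin sizes (illustrative values)
SELECTCOLUMNS (
    { 25000, 50000, 100000, 200000 },
    "Bin Size", [Value]
)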

And then, finally, I bound the column in this new table to the binsizes M parameter:

Here’s the final report, where you can see the number of transactions in each price bin and use slicers to change the price bins and change the county you’re looking at:

Apart from being a new solution to an old problem, what does this mean? Well, with a data source like ADX where dynamic M parameters are supported and you can write your own queries in DirectQuery mode, you can do whatever you want in the query so long as it always returns the same columns. As well as creating parameterised queries, in ADX you could also create a user-defined function (something like a table-valued function in SQL Server), pass parameters into it using M parameters, and have a totally dynamically generated fact table in your dataset, as the sketch below shows. Either way, using dynamic M parameters in this way allows you to use KQL’s specialised functions for basket analysis or geospatial analysis to do things that are otherwise impossible in Power BI.
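
As a rough sketch of that user-defined function approach (the function name and parameter here are made up for illustration, and I haven’t tested this exact code), you could create a stored function in ADX like this:

.create-or-alter function TransactionsByBin(binsize: long) {
    // Group transactions by county and by price bins of the given size
    pricepaid
    | summarize Transactions=count() by county, bin(price, binsize)
}

…and then call it from the M query by building the query text "TransactionsByBin(" & Number.ToText(binsizes) & ")" instead of the inline summarize statement. I’m sure similar things are possible with other data sources too. And that, my friends, is why I’m so excited about dynamic M parameters.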

Build Scalable BI Solutions Using Power BI and Snowflake

On the Power BI team we want our customers to be successful whichever data source they’re using – not just the Microsoft ones. Recently I had the pleasure of recording a webinar on the subject of best practices for using Power BI with Snowflake, along with Craig Collier from Snowflake and Chris Holliday from Visual BI (a partner that specialises in Power BI/Snowflake solutions). You can watch it here:

https://info.microsoft.com/ww-landing-build-scalable-BI-solutions-using-power-BI-and-snowflake.html

There’s a summary of what we talk about on the Snowflake blog:

https://www.snowflake.com/blog/maximizing-power-bi-with-snowflake/

We’re continuously improving the performance and functionality of all our connectors, and Snowflake is no exception: we recently announced that in early 2021 we’ll have support for Snowflake roles and the ability to use your own SQL queries in DirectQuery mode. Power BI support for Azure Service Tags (on the roadmap here) will also be very useful for Snowflake customers.

A Function To Visualise Parallelism In Power Query Diagnostics Data

Most of the time I’ve spent looking at Power Query query diagnostics data has been spent looking at the data for a single query. Over the past few days, though, I’ve spent some time investigating what this data shows for all the queries executed during a single dataset refresh. To help me do this I wrote the following M function:

(InputTable as table) =>
  let
    Source = InputTable,
    // Find the start time of the earliest query execution
    EarliestStart = List.Min(Source[Start Time]),
    // Calculate each event's start and end time
    // relative to that earliest start time
    AddRelativeStart = Table.AddColumn(
        Source,
        "Relative Start",
        each [Start Time] - EarliestStart
      ),
    AddRelativeEnd = Table.AddColumn(
        AddRelativeStart,
        "Relative End",
        each [End Time] - EarliestStart
      ),
    SetToDurations = Table.TransformColumnTypes(
        AddRelativeEnd,
        {
          {"Relative Start", type duration},
          {"Relative End", type duration}
        }
      ),
    // Convert those durations to numbers of seconds
    CalculatedTotalSeconds = Table.TransformColumns(
        SetToDurations,
        {
          {"Relative Start", Duration.TotalSeconds, type number},
          {"Relative End", Duration.TotalSeconds, type number}
        }
      ),
    // Group by query execution, taking the earliest relative start
    // and the latest relative end across each execution's events
    GroupedRows = Table.Group(
        CalculatedTotalSeconds,
        {"Id", "Query"},
        {
          {"Relative Start", each List.Min([Relative Start]), type number},
          {"Relative End", each List.Max([Relative End]), type number}
        }
      ),
    // Concatenate the Id and Query columns to get a unique
    // label for each execution to use on a chart axis
    MergeColumns = Table.AddColumn(
        GroupedRows,
        "Id Query",
        each Text.Combine({[Id], [Query]}, " "),
        type text
      ),
    // Convert Id to a number so it can be used as a Sort By column
    IdToNumber = Table.TransformColumnTypes(
        MergeColumns,
        {{"Id", type number}}
      )
  in
    IdToNumber

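Assuming you’ve saved this function as a query called, say, DiagnosticsToGantt (a name I’ve made up for this example), you can call it from a new query like this, where #"Diagnostics_Detailed" stands in for whatever your detailed diagnostics query is actually called:

let
    // Apply the function to the detailed diagnostics output
    Output = DiagnosticsToGantt(#"Diagnostics_Detailed")
in
    Output
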
If you invoke this function and pass in the “Detailed” query diagnostics data query:

…you’ll end up with a query that gives you a table that looks something like this:

This table has one row for each Power Query query that got executed while the diagnostics trace was running, an Id column to uniquely identify each execution, the name of the query executed, an Id Query column that concatenates the previous two columns, and Relative Start and Relative End columns that give you the number of seconds from the start time of the first query executed up to the start time and end time of the query on the current row.

Loading the data into Power BI allows you to build a stacked bar chart with Id Query on the axis and Relative Start and Relative End together in the values:

If you then set the Sort By Column property of Id Query to the Id column (which is numeric, so this ensures that the values in Id Query are sorted correctly), set the Data Color property of the Relative Start values to white (or whatever the background colour of your chart is) so that it’s invisible, and finally sort the axis of the stacked bar chart by Id Query in ascending order, you get a visual something like this:

This is a kind of Gantt chart where the x axis shows the number of seconds since the start of the first query execution and each bar shows the start and end times of each query, making it easy to see which queries are executing in parallel.

“What can this show us?” I hear you say. Well, that’s something for a future blog post. All that I can say right now is that it’s COMPLICATED and I don’t properly understand it myself yet. But it’s undoubtedly interesting.

View Native Query Now Works For Analysis Services Data Sources

If you’re familiar with the topic of query folding in Power Query, you’ll know that the View Native Query right-click option in the Applied Steps pane of the Power Query Editor can be used to show the native query that is run against the data source. You may also know that there are some data sources where query folding does take place but where View Native Query remains greyed out. One of those used to be Analysis Services, but the good news is that this is no longer the case: you can now use View Native Query when importing data from Analysis Services! Look:

I’m told it also now works for SAP BW, but I haven’t tested it.
