Inlining KQL Query Fragments Using Value.NativeQuery() In Power Query

If the title wasn’t enough to warn you, this post is only going to be of interest to M ultra-geeks and people using Power BI with Azure Data Explorer – and I know there aren’t many people in either group. However I thought the feature I’m going to show you in this post is so cool I couldn’t resist blogging about it.

Let’s say I have a table in Azure Data Explorer called Product that looks like this:

Because this table is in Azure Data Explorer I can query it with KQL, and the thing I love most about KQL as a language is the way that each query is built up as a series of piped transformations. So, for example, on the table above I could write the following query:

Product
| where Type=='Berry'
| where FruitID >1

…and get this result:

It seems quite similar to the way M works, right? Anyway, now for the cool part. When you connect Power Query in Power BI up to Azure Data Explorer, not only does query folding take place but you can mix and match native KQL fragments (added using the Value.NativeQuery M function) and Power Query transformations created in M, and it all still folds! This completely blew my mind when I found out about it. It’s documented briefly here but the docs don’t explain everything that you can do with it.

Consider the following M query:

let
  Source
    = AzureDataExplorer.Contents(
        "https://xyz.ukwest.kusto.windows.net/", 
        "MyDatabase", 
        "Product"
      ),
  #"Filtered Rows"
    = Table.SelectRows(
        Source, 
        each ([Type] = "Berry")
      ),
  Q1 = Value.NativeQuery(
      #"Filtered Rows", 
      "| where FruitID>2"
    )
in
  Q1

There are three steps:

  1. Source connects to the Product table
  2. #"Filtered Rows" is a filter generated by the Power Query UI to filter the table down to just the rows where Type is “Berry”
  3. Q1 uses Value.NativeQuery to add a KQL filter onto the output of #"Filtered Rows" so that only the rows where FruitID is greater than 2 are returned

Here’s the KQL this folds to:

Product
| where strcmp(["Type"], ("Berry")) == 0
| where FruitID>2

You can also use Value.NativeQuery more than once. Here’s another M query that returns the same result as above:

let
  Source
    = AzureDataExplorer.Contents(
        "https://xyz.ukwest.kusto.windows.net/", 
        "MyDatabase", 
        "Product"
      ),
  Q3 = Value.NativeQuery(
      Source, 
      "| where Type=='Berry'"
    ),
  Q4 = Value.NativeQuery(
      Q3, 
      "| where FruitID>2"
    )
in
  Q4

This is very useful when you want to build a query quickly using the Power Query Editor but then need to use a KQL feature that isn’t supported by Power Query.
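
For example, here’s a minimal sketch (reusing the made-up cluster, database and table names from above) that appends a native KQL summarize aggregation onto a filter step generated by the Power Query UI; based on the behaviour described above, the whole thing should still fold to a single KQL query:

let
  Source
    = AzureDataExplorer.Contents(
        "https://xyz.ukwest.kusto.windows.net/", 
        "MyDatabase", 
        "Product"
      ),
  //Filter created through the Power Query UI
  #"Filtered Rows"
    = Table.SelectRows(
        Source, 
        each ([Type] = "Berry")
      ),
  //Native KQL aggregation appended onto the steps above
  Summarised = Value.NativeQuery(
      #"Filtered Rows", 
      "| summarize FruitCount=count() by Type"
    )
in
  Summarised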

Using Small Multiples In Power BI To Improve Report Performance

While the long-awaited small multiples feature that previewed in the December 2020 release is an obvious boost to Power BI’s data visualisation capabilities, did you know that you can use it to improve report performance too?

Earlier this year I wrote blog posts showing how you can improve report performance by showing the same amount of data in fewer visuals (for example by replacing several cards with a single table) and how the number of visuals on a page affects report performance even if they aren’t displaying any data; several other people have written similar posts too. Small multiples are just another way you can replace several visuals with a single visual that displays the same data.

To illustrate, consider the following report with five separate line chart visuals on it that are identical apart from the fact that there is a different filter set on each one:

Here’s what Performance Analyzer shows when the page is refreshed:

In this case everything is fairly quick, but notice that each DAX query takes 10-12ms and by the time we have reached the “Count Of Sales by Date for Terraced” visual the total render duration has reached 710ms.

Now, here’s the same data in a single line chart visual using small multiples:

It’s the same data and the same charts, but look at what Performance Analyzer shows now:

There are two things to point out:

  1. There is only one DAX query which, at 12ms, performs about the same as each of the five DAX queries in the previous version of the report. In this case, by requesting all the data in a single query rather than five separate queries, Power BI has been able to optimise how it retrieves the data it needs. This doesn’t mean that, from a DAX point of view, the small multiples version of the report is five times faster than the original, because Power BI will have run the five queries in parallel in that version; but in general you will see some improvement in overall performance from this consolidation of queries, and in some cases it can be quite significant.
  2. While the sum of the visual display durations for each of the separate visuals is basically the same as the visual display duration for the small multiples visual – which makes sense because they display the same data in the same way – the total duration of the small multiples visual is 486ms compared to 710ms for the total duration of the Count Of Sales by Date for Terraced visual in the original version, so there has been a definite overall improvement in rendering time. In fact, Performance Analyzer doesn’t really give you an accurate way of measuring the overall time taken to render a report page. A much better technique is the one I blogged about here, and this suggests the overall performance saving from using small multiples is almost 500ms.

In conclusion, then, if you have any groups of visuals on your reports that can be replaced by a single small multiples visual then I recommend that you do so – you may see an improvement in performance as a result. Remember also that it’s still early days for small multiples: as more and more features are added to it, and as more and more visuals support small multiples, the more opportunity you will have to consolidate visuals.

Capturing SQL Queries Generated By A Power BI DirectQuery Dataset

If you’re using DirectQuery mode for one or more tables in your Power BI dataset, the chances are that you will want to see the SQL (or whatever query language your DirectQuery data source uses) that is generated by Power BI when your report is run. If you can view the queries that are run in the tooling for the data source itself, for example, using Extended Events or SQL Server Profiler for SQL Server, then great – but you may not have permissions to do this. The good news is that you can capture the SQL queries in Power BI Desktop too, even though it’s not always obvious how to do so.

For some data sources like SQL Server then Performance Analyzer will give you the SQL queries generated. All you need to do is go to the View tab in the main Power BI Desktop window, click on the Performance Analyzer button to display the Performance Analyzer pane, click on Start Recording and then Refresh Visuals, find the event corresponding to the visual whose queries you want to view, expand it and then click on the “Copy query” link:

This will copy the DAX query generated by your visual to the clipboard; in the case of SQL Server DirectQuery sources you’ll also get the SQL query generated for that DAX query.

However this method does not work for all DirectQuery data sources; for them you’ll need to use the Query Diagnostics functionality in the Power Query Editor. You need to open the Power Query Editor window, go to the Tools tab on the ribbon, click on the Start Diagnostics button, go back to the main Power BI window, refresh your visuals (you can use the Refresh visuals button in Performance Analyzer for this again) and then go back to the Power Query Editor and click the Stop Diagnostics button. When you do this several new Power Query queries will appear which contain diagnostics data. Go to the one that has a name that starts with “Diagnostics_Detailed” and somewhere in there – where exactly depends on the data source – you’ll find the query generated. For example, for a Snowflake data source you’ll see the SQL generated somewhere in the Data Source Query column:

For an Azure Data Explorer DirectQuery data source the KQL query will be in one of the Record values in the Additional Info column:

One thing to watch out for is that you may also see what look like SQL Server TSQL queries, even when you’re not using a data source that can be queried with TSQL. Here’s an example from the Azure Data Explorer example above:

You can ignore these queries: they’re not useful, although they do give you an interesting insight into how DirectQuery mode works behind the scenes.
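
If you don’t want to scroll through the diagnostics output by hand, here’s a minimal sketch that keeps just the rows where the Data Source Query column is populated; it assumes the detailed diagnostics query is called Diagnostics_Detailed (the real name ends with a date and time), and remember that for some sources, like the Azure Data Explorer example above, the query is in the Additional Info column instead:

let
  //The auto-generated diagnostics query; the exact name is an assumption here
  Source = #"Diagnostics_Detailed",
  //Keep only the rows that contain a native query
  NativeQueries = Table.SelectRows(
      Source, 
      each [Data Source Query] <> null and [Data Source Query] <> ""
    ),
  //Keep only the column containing the native queries
  Result = Table.SelectColumns(
      NativeQueries, 
      {"Data Source Query"}
    )
in
  Result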

Using Power BI Dynamic M Parameters In DAX Queries

As I’m sure you’ve guessed by now I’m a big fan of dynamic M parameters in Power BI. They’re easy to use in Power BI Desktop but what if you want to use them in your own DAX queries? Documentation for this is coming soon, but in the meantime I thought it would be useful to show the additions to DAX query syntax to support them – something you can see for yourself if you take a look at the DAX queries generated by Power BI Desktop using Performance Analyzer.

Here’s an example of a query generated by Power BI Desktop where there are three dynamic M parameters defined: DateParameter, TextParameter and NumericParameter.

DEFINE
  MPARAMETER DateParameter = 
    DATE(2020, 1, 1)

  MPARAMETER TextParameter = 
    "January"

  MPARAMETER NumericParameter = 
    1

  VAR __DS0FilterTable = 
    TREATAS({DATE(2020, 1, 1)}, 'ParamValues'[DateP])

  VAR __DS0FilterTable2 = 
    TREATAS({"January"}, 'ParamValues'[MonthNameP])

  VAR __DS0FilterTable3 = 
    TREATAS({1}, 'ParamValues'[MonthNoP])

  VAR __DS0Core = 
    CALCULATETABLE(
      DISTINCT('DATE'[DATE]),
      KEEPFILTERS(__DS0FilterTable),
      KEEPFILTERS(__DS0FilterTable2),
      KEEPFILTERS(__DS0FilterTable3)
    )

  VAR __DS0PrimaryWindowed = 
    TOPN(501, __DS0Core, 'DATE'[DATE], 1)

EVALUATE
  __DS0PrimaryWindowed

ORDER BY
  'DATE'[DATE]

The dynamic M parameters are set using a DEFINE statement and the new MParameter keyword. The name of the parameter here is the same as the name of the parameter defined in the Power Query Editor; one thing to point out is that if your M parameter name contains a space, it should be surrounded by single quotes.

If you have enabled multi-select on your dynamic M parameter, you will need to pass a table of values to it in your DAX query. Here’s an example of how to do this with a table constructor:

DEFINE
  MPARAMETER 'MonthParameter' = 
    {"January",
      "February"}

I’ll admit I haven’t tried this yet, but you should be able to do some really cool stuff with this in a paginated report connected to a DirectQuery dataset if it contains dynamic M parameters. Maybe in a future post…

Session Recording: “Advanced Analytics Features In Power BI”

Recently I had the pleasure of presenting for the team at Planilheiros (whose Portuguese-language YouTube channel is almost as popular as Guy In A Cube) at their Power BI Summit. My session was on “Advanced Analytics Features in Power BI” and you can watch it here:

It’s on all the built-in analytics features in Power BI that you can use to make sense of your data – without using any DAX or M. I look at things like adding forecasts to line charts, adding ratio lines and symmetry lines to scatter charts, “explain the increase/decrease” on column charts, the Key Influencers and Decomposition Tree visuals and lots more.

Delaying Power BI Dataset Refresh Until The Source Data Is Ready

This week a customer came to me with the following problem: they had scheduled the refresh of their dataset but their source data wasn’t always ready in time, so the old data was being loaded by mistake. The best solution here is to use some kind of external service (for example Power Automate) to poll the data source regularly to see if it’s ready, and then to refresh the dataset via the Power BI REST API when it is. However, it got me thinking about a different way of tackling this: is it possible to write some M code that will do the same thing? It turns out that it is, but it’s quite complicated – so I don’t recommend you use the code below in the real world. Nevertheless I wanted to write up the solution I came up with because it’s interesting and you never know, it might be useful one day.

Here’s the scenario I implemented. Using a SQL Server data source, let’s say that you want to load the result of the following SQL query against the AdventureWorksDW2017 database into Power BI:

SELECT DISTINCT [EnglishDayNameOfWeek] FROM DimDate

However, let’s also say that the data in this table may not be ready at the time when dataset refresh is scheduled; when the data is ready, the sole value in the sole column of another table, called ContinueFlagTable, will be changed from False to True:

Therefore when dataset refresh starts we need to check the value from the ContinueFlagTable regularly; if it is False we need to wait for a given amount of time and then check again; if it is True the data can be loaded; if the flag hasn’t changed to True after checking a certain number of times then we need to raise an error.

Here’s the full M code of the Power Query query:

let
  //The SQL Server database
  Source = Sql.Database(
      "MyServerName", 
      "AdventureWorksDW2017"
    ),
  //The query to load into the dataset
  ResultQuery
    = Value.NativeQuery(
        Source, 
        "SELECT DISTINCT [EnglishDayNameOfWeek] FROM DimDate"
      ),
  //The number of times to check the Continue flag
  MaxIterations = 3,
  //The number of seconds to wait before checking the Continue flag
  DelaySeconds = 5,
  //Check the Continue flag
  Iterate = List.Generate(
      () => 1, 
      each (
      //Query the ContinueFlag table
      //after waiting the specified number of seconds
      //and keep trying until either it returns true
      //or we reach the max number of times to check
      (Function.InvokeAfter(
          () => 
            let
              dbo_ContinueFlag
                = Value.NativeQuery(
                    Source, 
                    "SELECT [ContinueFlagColumn] from ContinueFlagTable where -1<>@param", 
                    [
                      param
                        = _
                    ]
                  ),
              Continue
                = dbo_ContinueFlag{0}[ContinueFlagColumn]
            in
              Continue, 
          #duration(
              0, 
              0, 
              0, 
              DelaySeconds
            )
        )
        = false
      )
        and (_ < MaxIterations)
      ), 
      each _ + 1
    ),
  //Find how many times the ContinueFlag was checked
  NumberOfIterations
    = List.Max(Iterate),
  //Did we reach the max number of checks?
  ReturnError
    = NumberOfIterations
      = MaxIterations
      - 1,
  //Table type of the table to load into the dataset
  ReturnTableType
    = type table [
      EnglishDayNameOfWeek = Text.Type
    ],
  //Error message to return if the
  //max number of checks is reached
  ErrorMessage
    = error "After waiting "
      & Text.From(
        MaxIterations
        * DelaySeconds
      )
      & " seconds your data is still not ready to be loaded",
  //Handle Power BI Desktop's behaviour of asking for
  //the top 0 rows from the table
  //before loading the actual data
  OverrideZeroRowFilter = 
    if ReturnError then 
      ErrorMessage
    else 
      Table.View(
          null, 
          [
            GetType = () => 
              ReturnTableType, 
            GetRows = () => 
              ResultQuery, 
            OnTake
              = (
                count as number
              ) => 
                if count = 0 then 
                  #table(
                      ReturnTableType, 
                      {}
                    )
                else 
                  Table.FirstN(
                      ResultQuery, 
                      count
                    )
          ]
        )
in
  OverrideZeroRowFilter

There’s a lot to explain here:

  • Everything is in a single query, and this is deliberate: I found it was the only way I could make sure that the query to load the data (ie the query in the ResultQuery step) is run only once, after the check on the ContinueFlagTable has returned true. If I split the code to get the data off into a separate query, I found that it got queried before any checks on ContinueFlagTable; this seemed to be something to do with the formula firewall, but I’m not sure.
  • I’ve used List.Generate to do the polling of ContinueFlagTable; there’s a stripped-down sketch of this pattern after this list. There’s a good example of how to use it to implement Do Loops here.
  • I’ve used Function.InvokeAfter to add the delays in between checks on ContinueFlagTable.
  • I found that if I ran the same query to check ContinueFlagTable, after the first execution the result was cached and the query couldn’t tell if the flag changed after that. Therefore I wrote a SQL query with a WHERE clause that was different each time but which always returned true; I did this by passing the iteration number into the query via a parameter, and I passed the parameter in using the Value.NativeQuery function.
  • It’s possible to raise an error when needed using the error keyword, described here.
  • The OverrideZeroRowFilter step at the end uses the technique I blogged about here to deal with the fact that Power BI Desktop asks for the top 0 rows from a table to get the schema before it loads the data.
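
To make the polling pattern easier to see on its own, here’s a stripped-down sketch of just the List.Generate and Function.InvokeAfter combination, with a hard-coded function standing in for the query against ContinueFlagTable. Note that if the flag never becomes true the largest value in the generated list is MaxIterations - 1, which is exactly what the ReturnError step in the full query tests for:

let
  MaxIterations = 3,
  DelaySeconds = 5,
  //Stand-in for the real check against ContinueFlagTable
  CheckFlag = (iteration as number) as logical => false,
  //Keep iterating while the flag is false and we haven't 
  //reached the maximum number of checks, waiting 
  //DelaySeconds before each check
  Iterate = List.Generate(
      () => 1, 
      each Function.InvokeAfter(
          () => CheckFlag(_), 
          #duration(0, 0, 0, DelaySeconds)
        ) 
        = false
        and (_ < MaxIterations), 
      each _ + 1
    ),
  NumberOfIterations = List.Max(Iterate)
in
  NumberOfIterations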

Refreshing the table in Power BI Desktop with the value True in ContinueFlagTable loads the data successfully; in Profiler you can see one query to ContinueFlagTable and one query to get the data to be loaded from the DimDate table:

Refreshing the table with the value False in ContinueFlagTable shows the specified number of queries to ContinueFlagTable and no query to DimDate:

Here’s the error message you get in Power BI Desktop when this happens:

Here’s the same error in the Power BI Service when you try to refresh the dataset:

Visualising Azure Analysis Services Processing Tasks With The Job Graph Events Sample

I’ll leave it to other people to get worked up about things like Power BI Premium gen2 – what really gets me excited is a new Profiler event type. And indeed there is a new one to play with – Job Graph – that you can use to visualise Azure Analysis Services processing jobs (it’s not enabled yet for the Analysis Services engine in Power BI).

More details about it, and how it can be used, are in the samples here:

https://github.com/microsoft/Analysis-Services/tree/master/ASJobGraphEvents

The data returned by the Job Graph event isn’t intelligible if you look at the text it returns in Profiler. However if you save a .trc file with Job Graph event data to XML you can use the Python scripts in the GitHub repo to generate DGML diagrams that can be viewed in Visual Studio, plus Gantt charts embedded in HTML. Of course to do this you’ll need to have Python installed; you’ll also need to have Visual Studio and its DGML editor installed (see here for details).

Here’s the DGML diagram for the refresh of a very simple AAS database with just one table in it, called People:

The eight nodes in a semi-circle around the top of the diagram represent the eight jobs that process the attribute hierarchies in the table. These eight jobs must be completed before the single partition in the table is processed (this is the node in the centre of the diagram) and that partition must be processed before the table itself is processed (this is the node at the bottom of the diagram).

Since this is the annotated output you can hover over a node in the diagram and see more details about it, including its duration and how long it was blocked for:

Notice how three nodes have dark backgrounds? These nodes represent the critical path. From the sample documentation:

When the job graph executes, there is always a job that finishes last before the engine can commit the change. This job that finishes last is the “critical dependency” for the commit; the entire commit needs to wait for this one job to finish before executing. This last job depends on other jobs, one of which finished after all the others. This is the next critical dependency. Tracing this path of critical dependencies forms the critical path, which helps engineers and customers identify why processing takes so long.

To diagnose slow refresh times, look for the critical path and start at the top. Look at “Blocked duration”, “Waiting duration”, and “Running duration”. If a job has a long blocked duration, it spent a long time waiting on other jobs. If a job has a long waiting duration, it was waiting for an available thread, so increasing maxParallelism could help. If a job has a long running duration, then the model might have to be changed to speed up the job.

Here’s another diagram, generated from a slightly more complex database with four tables, a calculated column and a calculated table:

There’s also a script that takes this data and generates a Gantt chart from it, but at the time of writing it seems to be a bit buggy; I can make it work if I comment out some code but I suspect that means the output isn’t accurate. Hopefully it will get fixed soon, but in the meantime here’s an example of what it produces:

I think this is going to be really useful when I’m troubleshooting processing performance problems. It would also be really cool to have this in DAX Studio (hint, hint)…

Handling Multi-select In Power BI Dynamic M Parameters

Even though the documentation for dynamic M parameters does mention how to handle multi-select in the M code for your Power Query queries, I thought it would be useful to provide a detailed example of how to do this and explain what happens behind the scenes when you use multi-select.

Once again I’m going to use Azure Data Explorer as my DirectQuery data source in this post. Consider the following simple KQL query, which returns a list of counties in the UK and the average price paid for real estate in them:

pricepaid
| where county in ('DEVON', 'CORNWALL', 'KENT')
| summarize avg(price) by county

The second line of this query uses a combination of the KQL where and in operators to filter the list of counties in a way that’s very similar to other query languages such as SQL. The challenge is to write an M expression that will generate the comma-delimited list of county names in parentheses at the end of this line.

Taking the query above and generating a non-dynamic DirectQuery table in Power BI is straightforward; let’s assume that you have already created an M parameter called SelectedCounty:

… a table of county names:

…and bound the County column of this table to the SelectedCounty M parameter and turned on the Multi-select option in Model View in the main Power BI Desktop window:

Now the confusing thing for me, when I first started to look at this problem, was that the M parameter called SelectedCounty I created was of type Text but the example code in the documentation was written for an M parameter of type list – when in fact it isn’t possible to create an M parameter of type list in the Power Query Editor (at least not at the time of writing). It turns out that when the Multi-select option is turned on Power BI is able to send a value of type list to the parameter regardless of what type you have defined for it.

With that knowledge here’s an example of an M query to generate the query using the parameter:

let
  CountyList = 
    if 
    //check to see if the parameter is a list
      Type.Is(
        Value.Type(SelectedCounty), 
        List.Type
      ) then 
        //if it is a list
        let
          //add single quotes around each value in the list
          AddSingleQuotes = List.Transform(
              SelectedCounty, 
              each "'" & _ & "'"
            ),
          //then turn it into a comma-delimited list
          DelimitedList = Text.Combine(
              AddSingleQuotes, 
              ","
            )
        in
          DelimitedList
    else 
      //if the parameter isn't a list
      //just add single quotes around the parameter value
      "'" & SelectedCounty & "'",
  //generate and run the KQL query
  Source = AzureDataExplorer.Contents(
      "https://mycluster.northeurope.kusto.windows.net", 
      "pricepaid", 
      "pricepaid#(lf)
       | where county in 
       ("
        & CountyList &
       ")#(lf)
       | summarize avg(price) by county", 
      [
        MaxRows                 = null, 
        MaxSize                 = null, 
        NoTruncate              = null, 
        AdditionalSetStatements = null
      ]
    )
in
  Source

Note that the CountyList step has to check the data type of the parameter using an if statement, because in the Query Editor it will always be Text whereas in the report it will be a list if multi-select is turned on and Text if not.
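
If you want to test the list-handling branch without publishing the report, here’s a minimal sketch that reproduces the same logic in the Power Query Editor, with a hard-coded list standing in for the value that Power BI passes to the parameter at runtime:

let
  //Stand-in for the value Power BI sends when multi-select is used
  TestValue = {"DEVON", "CORNWALL", "KENT"},
  CountyList = 
    if Type.Is(Value.Type(TestValue), List.Type) then 
      //add single quotes around each value and comma-delimit the list
      Text.Combine(
          List.Transform(TestValue, each "'" & _ & "'"), 
          ","
        )
    else 
      "'" & TestValue & "'"
in
  //returns 'DEVON','CORNWALL','KENT'
  CountyList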

Here’s the final report showing a multiselect slicer passing values into this query:

[Thanks to Ravi Kiran Vemulapalli and Sujata Narayana for their help with this post]

Geospatial Analysis With Azure Data Explorer, Power BI And Dynamic M Parameters

Since last week’s blog post about dynamic M parameters generated so much interest, this week I thought I’d give you another example of something cool you can do with them when you’re using Azure Data Explorer (ADX) as a DirectQuery source in Power BI: geospatial analysis.

Let’s say you work for a chain of supermarkets and want to use Power BI to see what other competing stores are close to one of your stores. First of all you need some data: in this case my data source was this list of British supermarkets and their locations, published by the nice people at Geolytix, and called Retail Points. Once again I loaded the csv file into ADX and was able to write a KQL query using the geo_point_in_circle() function to filter the list of supermarkets down to those within a specified distance of a given supermarket. For example, here’s a simple KQL query that finds all the supermarkets within a 5000m radius of the location of my local Marks and Spencer store in Amersham:

let centrestore =  
retailpoints
| where store_name == "M&S Amersham SF"
| project long_wgs, lat_wgs;
let centrelat = toscalar( centrestore | summarize max(lat_wgs));
let centrelong = toscalar ( centrestore | summarize max(long_wgs));
retailpoints
| where geo_point_in_circle(long_wgs, lat_wgs, centrelong, centrelat, 5000)

[Apologies if this isn’t the greatest KQL query – I’m still learning]

I was then able to create two M parameters called selectedstore and selectedradius and bind them to two other tables in my dataset using dynamic M parameters so that the end user could use slicers to select:

  • The store they want to use as the central point of their analysis
  • The size in metres of the radius of the circle around the central point within which other stores should be displayed

These M parameters could then be injected into the KQL query of a DirectQuery table using the following M query:

let
  Source = AzureDataExplorer.Contents(
      "https://mycluster.northeurope.kusto.windows.net", 
      "retailpoints", 
      Text.Format(
          "declare query_parameters(filterstore:string = ""#[storeMparametername]"", centreradius:int = #[radiusMparametername]);#(lf)let centrestore =  #(lf)retailpoints#(lf)| where store_name == filterstore#(lf)| project long_wgs, lat_wgs;#(lf)let centrelat = toscalar( centrestore | summarize max(lat_wgs));#(lf)let centrelong = toscalar ( centrestore | summarize max(long_wgs));#(lf)retailpoints#(lf)| where geo_point_in_circle(long_wgs, lat_wgs, centrelong, centrelat, centreradius)", 
          [
            storeMparametername  = selectedstore, 
            radiusMparametername = selectedradius
          ]
        ), 
      [
        MaxRows                 = null, 
        MaxSize                 = null, 
        NoTruncate              = null, 
        AdditionalSetStatements = null
      ]
    )
in
  Source

There are two important things to notice here:

  • I declared KQL query parameters to hold the selected store name and selected radius at the start of the KQL query. As the documentation notes, this is important to guard against KQL injection attacks.
  • I used the M Text.Format function to inject the M parameter values into the KQL query parameter declarations, which was easier and cleaner than generating the KQL query string by concatenating text; there’s a small standalone example of this after this list.
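
To see the Text.Format pattern in isolation, here’s a tiny sketch (with made-up values) showing how the #[name] placeholders in the format string are replaced with values from a record, which is the same mechanism the query above uses to build the query_parameters declaration:

let
  //made-up store name and radius, just for illustration
  kqlDeclaration = Text.Format(
      "declare query_parameters(filterstore:string = ""#[storename]"", centreradius:int = #[radius]);", 
      [
        storename = "M&S Amersham SF", 
        radius    = 5000
      ]
    )
in
  kqlDeclaration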

And here’s the report I built, for example showing how you can select an Aldi store in Aberdeen and display all stores within a 1km, 3km and 7km radius of that store:

Not the most sophisticated geospatial report in the world, I admit, and I know very few people are using Azure Data Explorer as a data source for Power BI right now, but I do think it is a fun example of the kind of analysis that dynamic M parameters now make possible.

Why I’m Excited About Dynamic M Parameters In Power BI

My favourite feature in the October 2020 release of Power BI is undoubtedly dynamic M parameters. However, before I explain why, I’m going to put my hands up and admit that we screwed up the communication around its launch and didn’t explain its limitations properly – which led to some understandable disappointment and confusion.

Although the blog post and the docs have now been updated to make it clear that this feature doesn’t work with many popular data sources like SQL Server, Synapse, Oracle, Teradata and SAP (see here for the full details), I thought I’d explain why. It’s because there are two sets of connectors that Power BI uses for DirectQuery: an older set that was built some time ago for the data sources originally supported by Analysis Services Tabular, and a newer set built in Power Query’s M language. Dynamic M parameters only work for this newer set of connectors; if you want to see this feature extended so it works for the older connectors too then please vote on ideas.powerbi.com, talk to your Microsoft account team, call Satya or whatever you normally do.

OK, so why am I excited? Well, because dynamic M parameters let you do a whole bunch of cool things that weren’t possible before. My original plan for this blog post was to show how they can be used to change the dimension column used in a visual dynamically but Mim beat me to it with this blog post. Instead in this post I’m going to show you how you can use them to create something like dynamic calculated columns and I also want to make a more general point that will Blow Your Mind (hopefully).

For my example I’m going to use Azure Data Explorer (ADX) as my data source. There are a few reasons why: it’s something I’ve been interested to learn more about for a while, and the ADX team recently announced a bunch of new features which make me think it will be increasingly useful in BI scenarios; it will be important to know it if you’re using the new Azure Monitor integration for Power BI log data; and, crucially, it’s one of the few M-based connectors where you can write your own queries in DirectQuery mode (although this is coming to Snowflake and Redshift next year too).

My source data is from the UK Land Registry Price Paid dataset, and for my demo report I loaded a table containing around 790,000 real estate transactions into ADX. Each real estate transaction has the price paid for the property:

One obvious way to analyse this data is to build a histogram and show the number of transactions that occur in price bins such as £0-£100000, £100000-£200000, £200000-£300000 and so on. This is quite easy to do in Power BI, for example using calculated columns in Import mode, unless you want to allow your users to be able to change the bin sizes themselves. Dynamic M parameters give you an easy way to solve this problem though.

KQL, the query language for ADX, has a really nice way of generating bins for histograms using the summarize() operator and bin() function. For example, the following KQL query does something like a SQL group by and shows the number of real estate transactions for each county and bins based on price in £25000 increments:

pricepaid 
| summarize Transactions=count() 
by county, bin(price, 25000)

In Power BI I created a parameter called binsizes:

…and then created a parameterised version of the KQL query above to populate a DirectQuery table. Here’s the M code for this table:

let
    Source = 
     AzureDataExplorer.Contents(
      "https://mycluster.northeurope.kusto.windows.net", 
      "pricepaid", 
      "pricepaid | summarize Transactions=count() by county, bin(price, " 
      & Number.ToText(binsizes) & 
      ")", 
      [MaxRows=null, MaxSize=null, NoTruncate=null, AdditionalSetStatements=null]
     )
in
    Source

Next I created a new table with the possible bin sizes that the user can choose using a DAX calculated table:

And then finally linked the column in this new table with the M parameter:

Here’s the final report, where you can see the number of transactions in each price bin and use slicers to change the price bins and change the county you’re looking at:

Apart from being a new solution to an old problem, what does this mean? Well, with a data source like ADX where M parameters are supported and you can write your own queries in DirectQuery mode, you can do whatever you want in the query so long as it always returns the same columns. As well as creating parameterised queries, in ADX you could also create a user-defined function (something like table-valued functions in SQL Server), pass parameters into it using M parameters, and have a totally dynamically generated fact table in your dataset. Either way, using dynamic M parameters in this way allows you to use KQL’s specialised functions for basket analysis or geospatial analysis to do things that are otherwise impossible in Power BI. I’m sure similar things are possible with other data sources too. And that, my friends, is why I’m so excited about dynamic M parameters.
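
As a postscript, here’s a sketch of that user-defined function idea: if you created a hypothetical KQL function in ADX called PricePaidSummary that takes a bin size and returns the aggregated results, the M for the DirectQuery table could call it and pass the binsizes parameter in like this:

let
    //"PricePaidSummary" is a hypothetical KQL function; the cluster and
    //database names are the same made-up ones used above
    Source = 
     AzureDataExplorer.Contents(
      "https://mycluster.northeurope.kusto.windows.net", 
      "pricepaid", 
      "PricePaidSummary(" & Number.ToText(binsizes) & ")", 
      [MaxRows=null, MaxSize=null, NoTruncate=null, AdditionalSetStatements=null]
     )
in
    Source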
