Why I’m Excited About Dynamic M Parameters In Power BI

My favourite feature in the October 2020 release of Power BI is undoubtedly dynamic M parameters. However, before I explain why, I’m going to put my hands up and admit that we screwed up the communication around its launch and didn’t explain its limitations properly – which led to some understandable disappointment and confusion.

Although the blog post and the docs have now been updated to make it clear that this feature doesn’t work with many popular data sources like SQL Server, Synapse, Oracle, Teradata and SAP (see here for the full details), I thought I’d explain why. It’s because there are two sets of connectors that Power BI uses for DirectQuery: an older set that was built some time ago for the data sources originally supported by Analysis Services Tabular, and a newer set built in Power Query’s M language. Dynamic M parameters only work with this newer set of connectors; if you want to see this feature extended so it works for the older connectors too then please vote on ideas.powerbi.com, talk to your Microsoft account team, call Satya or whatever you normally do.

OK, so why am I excited? Well, because dynamic M parameters let you do a whole bunch of cool things that weren’t possible before. My original plan for this blog post was to show how they can be used to change the dimension column used in a visual dynamically but Mim beat me to it with this blog post. Instead in this post I’m going to show you how you can use them to create something like dynamic calculated columns and I also want to make a more general point that will Blow Your Mind (hopefully).

For my example I’m going to use Azure Data Explorer (ADX) as my data source. There are a few reasons why: it’s something I’ve been interested in learning more about for a while, and the ADX team recently announced a bunch of new features which make me think it will be increasingly useful in BI scenarios; you’ll need to know it if you’re using the new Azure Monitor integration for Power BI log data; and, crucially, it’s one of the few M-based connectors where you can write your own queries in DirectQuery mode (although this is coming to Snowflake and Redshift next year too).

My source data is from the UK Land Registry Price Paid dataset, and for my demo report I loaded a table containing around 790,000 real estate transactions into ADX. Each real estate transaction has the price paid for the property:

One obvious way to analyse this data is to build a histogram and show the number of transactions that occur in price bins such as £0-£100000, £100000-£200000, £200000-£300000 and so on. This is quite easy to do in Power BI, for example using calculated columns in Import mode, unless you want to allow your users to change the bin sizes themselves. Dynamic M parameters give you an easy way to solve this problem though.

KQL, the query language for ADX, has a really nice way of generating bins for histograms using the summarize operator and the bin() function. For example, the following KQL query does something like a SQL group by and shows the number of real estate transactions for each county, with prices binned in £25000 increments:

pricepaid 
| summarize Transactions=count() 
by county, bin(price, 25000)

In Power BI I created a parameter called binsizes:
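For reference, a Power Query parameter is just a query whose M code is a single value with some metadata attached. The binsizes parameter looks something like this (I’ve assumed a default value of 25000 here, and the exact metadata the Power Query Editor generates for you may differ slightly):

25000 meta [IsParameterQuery = true, Type = "Number", IsParameterQueryRequired = true]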

…and then created a parameterised version of the KQL query above to populate a DirectQuery table. Here’s the M code for this table:

let
    Source = 
     AzureDataExplorer.Contents(
      "https://cwebb.northeurope.kusto.windows.net", 
      "pricepaid", 
      "pricepaid | summarize Transactions=count() by county, bin(price, " 
      & Number.ToText(binsizes) & 
      ")", 
      [MaxRows=null, MaxSize=null, NoTruncate=null, AdditionalSetStatements=null]
     )
in
    Source

Next I created a new table with the possible bin sizes that the user can choose using a DAX calculated table:
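In my case this was a DAX calculated table but, as far as I know, all the feature needs is a column of candidate values in the model to bind to the parameter, so you could equally build the table in Power Query or with Enter Data. Here’s a rough M sketch – the column name and bin size values are just illustrative:

let
    // A single-column table of candidate bin sizes (example values only)
    BinSizes = #table(
        type table [BinSize = Int64.Type],
        {{10000}, {25000}, {50000}, {100000}}
    )
in
    BinSizes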

And then finally linked the column in this new table with the M parameter:

Here’s the final report, where you can see the number of transactions in each price bin and use slicers to change the price bins and change the county you’re looking at:

Apart from being a new solution to an old problem, what does this mean? Well, with a data source like ADX where M parameters are supported and you can write your own queries in DirectQuery mode, you can do whatever you want in the query so long as it always returns the same columns. As well as creating parameterised queries, in ADX you could also create a user-defined function (something like table-valued functions in SQL Server), pass parameters into it using M parameters, and have a totally dynamically generated fact table in your dataset. Either way, dynamic M parameters used like this allow you to take advantage of KQL’s specialised functions for basket analysis or geospatial analysis to do things that are otherwise impossible in Power BI. I’m sure similar things are possible with other data sources too. And that, my friends, is why I’m so excited about dynamic M parameters.

Build Scalable BI Solutions Using Power BI and Snowflake

On the Power BI team we want our customers to be successful whichever data source they’re using – not just the Microsoft ones. Recently I had the pleasure of recording a webinar on the subject of best practices for using Power BI with Snowflake, along with Craig Collier from Snowflake and Chris Holliday from Visual BI (a partner that specialises in Power BI/Snowflake solutions). You can watch it here:

https://info.microsoft.com/ww-landing-build-scalable-BI-solutions-using-power-BI-and-snowflake.html

There’s a summary of what we talk about on the Snowflake blog:

https://www.snowflake.com/blog/maximizing-power-bi-with-snowflake/

We’re continuously improving the performance and functionality of all our connectors, and Snowflake is no exception: we recently announced that in early 2021 we’ll have support for Snowflake roles and the ability to use your own SQL queries in DirectQuery mode. Power BI support for Azure Service Tags (on the roadmap here) will also be very useful for Snowflake customers.

A Function To Visualise Parallelism In Power Query Diagnostics Data

Most of the time I’ve spent looking at Power Query Query Diagnostics data has been spent looking at data for a single query. Over the past few days, though, I’ve spent some time investigating what this data shows for all the queries that are executed during a single dataset refresh. To help me do this I wrote the following M function:

(InputTable as table) => 
  let
    Source = InputTable,
    EarliestStart = List.Min(Source[Start Time]),
    AddRelativeStart = Table.AddColumn(
        Source, 
        "Relative Start", 
        each [Start Time] - EarliestStart
      ),
    AddRelativeEnd = Table.AddColumn(
        AddRelativeStart, 
        "Relative End", 
        each [End Time] - EarliestStart
      ),
    SetToDurations = Table.TransformColumnTypes(
        AddRelativeEnd, 
        {
          {"Relative Start", type duration}, 
          {"Relative End", type duration}
        }
      ),
    CalculatedTotalSeconds
      = Table.TransformColumns(
          SetToDurations, 
          {
            {
              "Relative Start", 
              Duration.TotalSeconds, 
              type number
            }, 
            {
              "Relative End", 
              Duration.TotalSeconds, 
              type number
            }
          }
        ),
    GroupedRows = Table.Group(
        CalculatedTotalSeconds, 
        {"Id", "Query"}, 
        {
          {
            "Relative Start", 
            each List.Min([Relative Start]), 
            type number
          }, 
          {
            "Relative End", 
            each List.Max([Relative End]), 
            type number
          }
        }
      ),
    MergeColumns = Table.AddColumn(
        GroupedRows, 
        "Id Query", 
        each Text.Combine({[Id], [Query]}, " "), 
        type text
      ),
    IdToNumber = Table.TransformColumnTypes(
        MergeColumns, 
        {{"Id", type number}}
      )
  in
    IdToNumber

If you invoke this function and pass in the “Detailed” query diagnostics data query:

…you’ll end up with a query that gives you a table that looks something like this:

This table has one row for each Power Query query that was executed while the diagnostics trace was running. The Id column uniquely identifies each execution, the Query column gives the name of the query executed, the Id Query column concatenates the previous two columns, and the Relative Start and Relative End columns give you the number of seconds from the start time of the first query executed up to the start time and end time of the query on the current row.
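If you’d rather invoke the function in code than use the Invoke button, a new query along these lines does the same thing – I’ve assumed here that the function query is called VisualiseParallelism and the detailed diagnostics query is called Diagnostics_Detailed, so substitute whatever names you’re actually using:

let
    Source = VisualiseParallelism(Diagnostics_Detailed)
in
    Source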

Loading the data into Power BI allows you to build a stacked bar chart with Id Query on the axis and Relative Start and Relative End together in the values:

If you then set the Sort By Column property of Id Query to the Id column (which is numeric, so this ensures that the values in Id Query are sorted correctly), set the Data Color property of the Relative Start values to white (or whatever the background of your chart is) so that it’s invisible, and finally sort the axis of your stacked bar chart by Id Query in ascending order, you get a visual something like this:

This is a kind of Gantt chart where the x axis shows the number of seconds since the start of the first query execution and each bar shows the start and end times of each query, making it easy to see which queries are executing in parallel.

“What can this show us?” I hear you say. Well, that’s something for a future blog post. All that I can say right now is that it’s COMPLICATED and I don’t properly understand it myself yet. But it’s undoubtedly interesting.

View Native Query Now Works For Analysis Services Data Sources

If you’re familiar with the topic of query folding in Power Query, you’ll know that the View Native Query right-click option in the Applied Steps pane of the Power Query Editor can be used to show the native query that is run against the data source. You may also know that there are some data sources where query folding does take place but where View Native Query remains greyed out. One of those used to be Analysis Services, but the good news is that that is no longer the case: you can use View Native Query when importing data from Analysis Services! Look:

I’m told it also now works for SAP BW, but I haven’t tested it.

Exporting Data From Websites For Use In Power BI Using Power Automate UI Flows

As a Power BI developer I know how Power Query makes it easy to get data from web sources, but I also know there are limits to what it can do. I can get data from tables on web pages and I can get data from web services, but when confronted with a website where you have to click a button to download a file there’s a problem. Take https://data.police.uk/data/ for example:

This site allows you to download crime data for England and Wales, but to do so you need to check some boxes and then click the “Generate file” button at the bottom (yes I know there’s also an API, but this is only an example). You can’t click a button in Power Query.

Enter Power Automate UI Flows. I’ve been a fan of Power Automate for a while now, but it was only with the release of the new Power Automate Desktop preview last week that I decided to check out its Robotic Process Automation (RPA) capabilities. Basically, if you’re still doing lots of manual downloading, clicking, moving files etc before you can load your data into Power BI, then Power Automate UI Flows can do all this for you automatically by recording and replaying everything you do.

Now I’m not a UI Flows expert by any stretch of the imagination, but within an hour I was able to create a UI Flow that:

  • Opened a browser and went to https://data.police.uk/data/ (quick tip: if you want to save a file from a browser you need to use the special Automation Browser, not Chrome, Edge or Firefox)
  • Checked the boxes to download crime, outcome and stop and search data for Thames Valley Police
  • Clicked the Generate file button
  • Clicked the Download Now button on the next screen when it appeared
  • Downloaded the resulting zip file to a particular location
  • Unzipped the file to a subfolder
  • Deleted the zip file

Here’s what it looked like in Power Automate Desktop:

Now this is my first ever UI Flow so it’s probably not as robust as it could be, but it seemed to work ok. I was then able to call it from Power Automate and refresh a dataset once the UI Flow had finished:

I guess the point of this post is not to tell you how to use Power Automate UI Flows to extract data from web pages for use with Power BI, just that the option is there for you to use if you need it. It’s important to be aware of the other tools that are available in the Power Platform and understand how they can be used alongside Power BI, rather than focusing on trying to solve your problems with Power BI alone. Not only are there a lot of cool things happening with Power Automate that are directly relevant to Power BI (such as the new Export actions and the upcoming ability to kick off a Flow when a Power BI dataflow finishes refreshing – see the demo at 33:00 here), but there are also other features in Power Automate, like UI Flows or the ability to run an Office Script from a Flow, that can be used for BI purposes – the latter can be very useful for preparing Excel data for Power BI.

If you want to learn more about Power Automate Desktop there’s a good multi-part tutorial on YouTube starting here; there’s also a three-part series on extracting data from web pages (part 1, part 2, part 3), although it uses an example that Power Query handles much more elegantly 😉

Power BI FastTrack Recognized Solution Architect 2021 Nominations Open

Are you really, really good at designing Power BI solutions? If so, you should nominate yourself for Microsoft’s Power BI FastTrack Recognized Solution Architect (FTRSA) distinction! Here’s the official description of what this is:

FastTrack Recognized Solution Architect is a recognition conferred by the Power Platform product engineering team upon a practicing solution architect for consistently exhibiting deep architecture expertise and creating high quality solutions during customer engagements. A FastTrack Recognized Solution Architect typically works for a systems integrator partner.

You can see some of the people who received this distinction at the Microsoft Business Applications Summit 2020 here:

https://powerplatform.microsoft.com/en-us/fasttrack/recognized-solution-architects/

I’ll be honest, the bar is set extremely high and only the very best will be successful. For an architect to be eligible for the FTRSA distinction, he/she must meet the following requirements:

  • Must have a minimum of 2 years of experience with Power BI and a minimum of 5 years of experience with Enterprise BI solutions.
  • Must have a minimum of 2 years of experience as an Enterprise BI architect.
  • Must be working for a partner with Gold certification in the Data Analytics MPN Competency.
  • Must have been lead architect for at least 2 Power BI in-production implementations with at least 200 active users.

Why nominate yourself?

  • FTRSA profiles, and the partners that they work for, will be listed on a public-facing page on the Power Platform product websites in recognition of their contributions towards driving customer success.
  • Partners can indicate the number of FTRSA(s) they have in their promotional materials. They may also point prospective customers to the public-facing profile listing page mentioned above.
  • The architects will get an e-badge that can be shared on LinkedIn and other social media platforms.
  • The architects will receive a special mention at the annual Microsoft Business Applications Summit.

If you’re interested and meet all the requirements you can nominate yourself here:

http://aka.ms/FTRSANomination

Nominations close on November 30th 2020.

Power BI Report Performance And The Number Of Visuals On A Page

When you’re faced with a slow Power BI report it’s very easy to assume that the problem is something to do with the dataset, how it’s modelled and how the DAX has been written, and then disappear down a rabbit hole of Marco-and-Alberto videos trying to shave milliseconds off the time taken by each DAX query the report runs. It’s certainly an assumption that I make all the time, and indeed most performance problems can be fixed in this way. However, this can mean that you miss other important ways of improving report performance.

Two things happened last week to remind me of this. First of all I spent some time with my colleague Miguel Myers (check out his YouTube channel) who showed me how he had tuned a customer’s report and reduced one page’s time from 27 seconds to 10 seconds without making any changes to the dataset or DAX while still showing the same information on the page. He did this partly by using a smaller number of visuals to display the same values – something I blogged about here when I learned this from him earlier this year – but mostly by reducing the number of visuals like shapes and textboxes that don’t display any data at all. Secondly, I was helping some other colleagues with load testing a Power BI report using this tool, and trying to understand whether there was a performance bottleneck inside Power BI or on the client machines running the load test; it turned out that it was the latter problem.

You may have read the title of this post and guessed that I’m going to talk about reducing the number of visuals that display data from your dataset as a way of improving performance, but that’s not the case. In this blog post I want to show how visuals that do not display any data from your dataset can have a significant impact on report performance. Before we carry on I suggest you read the series of posts I wrote late last year on measuring the performance of reports in the browser using Chrome/Edge DevTools (part 1 is here, part 2 is here, part 3 is here) because I’ll be using techniques described in these posts in my testing.

I have a Power BI dataset with a single table in it and that table contains one column and one row:
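The contents of the table don’t really matter for this test; something like the following in Power Query would do (the column name and value here are made up):

let
    Source = #table(type table [MyColumn = text], {{"Hello"}})
in
    Source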

Here’s a report page with a single card on it that displays that single value:

As you might expect, this page is very quick to render. Using a secure embed link and the Lighthouse tab in DevTools, the Time To Interactive reading on a cold cache is 3.7 seconds (which, remember, is a worst-case test and slower than most users will experience).

Now, consider a second report page in the same pbix file with the same card visual on it, but this time with 260 rectangle visuals added:

These rectangle visuals don’t display any data. The Time To Interactive reading for this page is now a whopping 24.3 seconds! What’s more, Power BI Desktop is noticeably more sluggish when editing this page.

OK, this is an extreme example, but it does go to show how adding too many visuals to your report pages, even when those visuals do not display any data, can have a noticeable impact on report performance. What if you need lots of shape visuals to provide visual grouping in your report though? The easy solution is not to use lots and lots of shapes but to create an image with the outlines you want and set that as the page background. It can be made to look exactly the same but, in this case, using an image instead of all those shapes results in a page with a Time To Interactive reading of 4.5 seconds – only a bit slower than the original page. Most users won’t be able to spot the difference either:

None of what I have said so far is particularly new or ground-breaking, and indeed another one of my colleagues, Chris Hamill, wrote about this last year in a great post on how he uses PowerPoint to design backgrounds for Power BI reports. There is one last question that needs to be considered though, and it’s the question I was confronted with this week during the load testing: how do you know if you have this problem without completely redesigning your reports? Even if you don’t think you have a particularly cluttered report, is there still an opportunity to shave a second off your report load time by removing visuals? The information you see on the Network tab in DevTools can give you a hint.

Here’s what the Network tab (see this post for how to use it) in DevTools shows for the first report page above with only the card visual on it:

Focusing on the events in the 1.5 second period after the red line shown in the timeline at the top (which is the load event for the page) and before the first of the querydata requests (which are fired when the visuals request data from the dataset, one of the last things that happen in a report render), you’ll notice there are no significant gaps.

Compare this with the same part of the Network waterfall, a period of almost 14 seconds, for the version of the page with the 260 rectangles on:

Even without seeing the details, one thing is obvious: there are some large gaps where there is no network activity going on. These gaps are where JavaScript processing is happening on the client, and the more visuals on your report, the longer that JavaScript processing is likely to take. Other factors have a big impact on how long it takes too: the browser you use (the new Edge and Chrome are going to be a lot faster than Internet Explorer), the spec of your PC, and other resource-hungry processes running on your PC (such as other browser instances during a load test). Therefore, if you see gaps on the Network tab, you might be able to improve the performance of your report by reducing the number of visuals and/or using a more modern browser and/or upgrading your PC. However, I must point out that the behaviour of the Power BI web front end is intentionally not documented and could change at any time, so be careful how you interpret anything you see in DevTools.

Optimising The Performance Of Power Query Merges In Power BI, Part 5: Cross Joins

In a late addition to the series of posts that I started here a few months ago, I thought it would be useful to look at how to optimise the performance of a rare (but nonetheless useful) type of merge: the cross join. Cross joins are not available as a join kind in the Table.Join or Table.NestedJoin M functions but there are two ways of creating them in Power Query:

  1. Creating a custom column in your query that contains the query that you want to cross join with and then expanding, as Matt Allington describes here
  2. Creating a custom column on both your queries that contains a single value, such as the number 1, and then merging the two queries on that column (something I mentioned a long time ago here)

Which one gives you the best performance? Apart from my old blog post there’s at least one other post suggesting the second approach is much faster, but I thought it would be useful to do some proper testing.

For my source data I created a query called SourceQuery that returned a table with one thousand rows, containing all the integers from 1 to 1000:
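If you want to follow along, here’s a sketch of one way such a query could be built in M – the code later in this post expects the column to be called Column1:

let
    // One column, Column1, containing the integers 1 to 1000
    Source = #table(
        type table [Column1 = Int64.Type],
        List.Transform({1..1000}, each {_})
    )
in
    Source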

To test the first approach, I created a second query that referenced SourceQuery, then added a custom column that also returned the SourceQuery query:

…and then expanded this new custom column. The result was a query that returned a million rows – the cross join of the thousand-row SourceQuery table with itself.

Here’s the M code:

let
  Source = SourceQuery,
  #"Added Custom" = Table.AddColumn(
      Source, 
      "Custom", 
      each SourceQuery
    ),
  #"Expanded Custom" = Table.ExpandTableColumn(
      #"Added Custom", 
      "Custom", 
      {"Column1"}, 
      {"Custom.Column1"}
    ),
  #"Changed Type" = Table.TransformColumnTypes(
      #"Expanded Custom", 
      {{"Custom.Column1", Int64.Type}}
    )
in
  #"Changed Type"

The two timings from Profiler (the same ones that I have used to measure refresh performance throughout this series) were:

  • Progress Report End/25 Execute SQL – 0.03 seconds
  • Progress Report End/17 Read Data – 3.80 seconds

To test the second approach I added a custom column to the SourceQuery query that contained the value 1:
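For reference, the M for this is roughly as follows – I’ve assumed the amended query is the one called SourceQuery With Join Column that the merge below refers to:

let
    Source = SourceQuery,
    // Add a constant column to join on; every row gets the value 1
    AddJoinColumn = Table.AddColumn(Source, "JoinColumn", each 1)
in
    AddJoinColumn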

…and then created a new query that joined this query to itself:

…and then did an expand to achieve the same cross joined output as in the first test. Here’s the M code (all created in the Power Query Editor – no custom M code required):

let
  Source = Table.NestedJoin(
      #"SourceQuery With Join Column", 
      {"JoinColumn"}, 
      #"SourceQuery With Join Column", 
      {"JoinColumn"}, 
      "SourceQuery With Join Column", 
      JoinKind.Inner
    ),
  #"Expanded SourceQuery With Join Column"
    = Table.ExpandTableColumn(
        Source, 
        "SourceQuery With Join Column", 
        {"Column1"}, 
        {"SourceQuery With Join Column.Column1"}
      ),
  #"Removed Columns" = Table.RemoveColumns(
      #"Expanded SourceQuery With Join Column", 
      {"JoinColumn"}
    )
in
  #"Removed Columns"

The performance of this query was much better:

  • Progress Report End/25 Execute SQL – 0.03 seconds
  • Progress Report End/17 Read Data – 0.80 seconds

So, as you can see, the second approach is the one to use.

There’s another advantage of this second approach too, if you’re using a foldable data source like SQL Server: it is possible to make query folding happen, which is of course incredibly important from a performance point of view, although you have to be careful not to change the data type of your custom column (or at least be careful how you do it). For example, here’s the M for a query that gets the cross join of the DimScenario table from the Adventure Works DW database with itself:

let
  Source = Sql.Databases("localhost"),
  AdventureWorksDW2017 = Source{[Name
    = "AdventureWorksDW2017"]}[Data],
  dbo_DimScenario = AdventureWorksDW2017{[
    Schema = "dbo", 
    Item = "DimScenario"
  ]}[Data],
  #"Removed Columns" = Table.RemoveColumns(
      dbo_DimScenario, 
      {"FactFinance"}
    ),
  #"Added Custom" = Table.AddColumn(
      #"Removed Columns", 
      "Custom", 
      each 1
    ),
  #"Merged Queries" = Table.NestedJoin(
      #"Added Custom", 
      {"Custom"}, 
      #"Added Custom", 
      {"Custom"}, 
      "Added Custom", 
      JoinKind.Inner
    ),
  #"Expanded Added Custom"
    = Table.ExpandTableColumn(
        #"Merged Queries", 
        "Added Custom", 
        {"ScenarioKey", "ScenarioName", "Custom"}, 
        {
          "Added Custom.ScenarioKey", 
          "Added Custom.ScenarioName", 
          "Added Custom.Custom"
        }
      ),
  #"Removed Other Columns" = Table.SelectColumns(
      #"Expanded Added Custom", 
      {
        "ScenarioName", 
        "Added Custom.ScenarioName"
      }
    )
in
  #"Removed Other Columns"

Here’s the SQL code generated by this query:

select [$Outer].[ScenarioName] as [ScenarioName],
    [$Inner].[ScenarioName2] as [Added Custom.ScenarioName]
from 
(
    select [_].[ScenarioKey] as [ScenarioKey],
        [_].[ScenarioName] as [ScenarioName],
        1 as [Custom]
    from [dbo].[DimScenario] as [_]
) as [$Outer]
inner join 
(
    select [_].[ScenarioKey] as [ScenarioKey2],
        [_].[ScenarioName] as [ScenarioName2],
        1 as [Custom2]
    from [dbo].[DimScenario] as [_]
) as [$Inner] on ([$Outer].[Custom] = [$Inner].[Custom2])

Displaying Routes On A Map In Power BI

In last week’s post I described the new Power Query M functions for working with Well Known Text objects, and in a post a few weeks ago I showed how you can use the Icon Map custom visual to display Well Known Text data. In this post I’ll show you how you can put all this together to do something really useful: display routes on a map in a Power BI report.

First of all you’ll need to download the latest version of the Icon Map custom visual here, because at the time of writing the version in AppSource doesn’t have the WKT functionality needed. Second, you’ll need a data source with a table of latitudes and longitudes that represent points on a route that you want to plot. I’m going to use the Azure Maps Get Route Directions API as my data source; to keep things simple I’m going to use the Shared Key authentication method for this API, but I’ll explain below why this can be a problem in the real world. To follow the example you’ll need to create an Azure Maps Account in the Azure Portal so you can get your own key.

Here’s a simple example to start off with. The following M code:

let
  Source = Json.Document(
      Web.Contents(
          "https://atlas.microsoft.com/route/directions/json", 
          [ApiKeyName = "subscription-key", Query
            = [
            #"api-version" = "1.0", 
            query
              = "51.502243,-0.140073:51.517777,-0.138621", 
            travelMode = "pedestrian"
          ]]
        )
    ),
  routes = Source[routes],
  routes1 = routes{0},
  legs = routes1[legs],
  legs1 = legs{0},
  points = legs1[points],
  PointsList = List.Transform(
      points, 
      each GeographyPoint.From(
          [longitude], 
          [latitude]
        )
    ),
  LineRecord = [Kind = "LINESTRING", Points
    = PointsList],
  WKTLineString = Geography.ToWellKnownText(
      LineRecord
    ),
  ToTable = #table(
      type table[
        Category = number, 
        WKT = text, 
        Size = number
      ], 
      {{1, WKTLineString, 1}}
    )
in
  ToTable

…calls the Azure Maps Get Route Directions API to get a walking route between two locations in central London. When you run the query you’ll be prompted to authenticate – choose the Web API option and enter your key. It’s a very basic example of how the API can be used: I have assumed there’s only one leg in the route, then taken the list of latitudes and longitudes for that single leg, turned that into a list of points and then turned that list into a linestring as I showed in last week’s blog post. Here’s the output in the Icon Map visual:

This is what the table returned by the query looks like:

The Category and Size columns aren’t very interesting; Icon Map needs these columns to display data. The WKT column contains the linestring object. Here’s how the visual is set up in Power BI:

We can do better than this though. In the example pbix file I’ve uploaded here, I’ve created a more sophisticated query that asks for six different routes between the two locations used above that all go via one waypoint (so there are two legs in the route data returned). I won’t go through the M code because there’s nothing interesting in it, but the end result is a report that allows you to plot these six different routes in different colours on a map and use a slicer to hide and show them, along with some information on their length and travel time:

One last thing to point out regarding these examples is that they won’t refresh if you publish them to the Power BI Service because I’ve used the ApiKeyName option in Web.Contents to handle key authentication (see here) and that isn’t supported in the Service yet. To work around this you’ll need to hard code the Azure Maps API key in your query and use the Skip Test Connection option in the Power BI Service.
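For example, the first step of the query above could be rewritten along these lines, with the key passed as an ordinary query string parameter instead of via ApiKeyName (the placeholder obviously needs replacing with your own key, and remember that hard-coding keys like this has security implications); you’d then typically use Anonymous authentication for the atlas.microsoft.com data source:

let
    // Same call as before, but with the subscription key hard-coded
    // "<your Azure Maps key>" is a placeholder – paste in your own key
    Source = Json.Document(
        Web.Contents(
            "https://atlas.microsoft.com/route/directions/json",
            [
                Query = [
                    #"api-version" = "1.0",
                    #"subscription-key" = "<your Azure Maps key>",
                    query = "51.502243,-0.140073:51.517777,-0.138621",
                    travelMode = "pedestrian"
                ]
            ]
        )
    )
in
    Source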

Power Query Geography And Geometry Functions In Power BI And Excel

In the August 2020 release of Power BI Desktop six new Power Query functions were added: Geography.FromWellKnownText, Geography.ToWellKnownText, GeographyPoint.From, Geometry.FromWellKnownText, Geometry.ToWellKnownText and GeometryPoint.From. These functions (which are coming soon to Power Query in Excel too) make it easier to work with geographic and geometric data in the Well Known Text format. You can have all kinds of fun with these functions if you have a visual (like the Icon Map custom visual) that can display Well Known Text data, but I’ll leave that kind of thing for future blog posts. In this post I’ll explain the basics of how the functions actually work.

Let’s start with points. The Geography.FromWellKnownText and Geometry.FromWellKnownText functions convert Well Known Text values into records, which makes it simple to do things like extract latitude and longitude values without having to do complex parsing of the text itself. For example:

Geometry.FromWellKnownText("POINT (0.1 0.2)")

Returns a record that looks like this:

Whereas the following:

Geography.FromWellKnownText("POINT (0.1 0.2)")

…returns a similar record but one with fields for latitude and longitude rather than x and y:

Now you know what the record format of a point looks like, it’s easy to go in the other direction and convert a record into Well Known Text format. For example, the M expression:

Geography.ToWellKnownText([Kind="POINT", Longitude=0.1, Latitude=0.3])

returns the text value

POINT(0.1 0.3)

You can also generate a point by passing a longitude and latitude (and other optional parameters) to GeographyPoint.From and GeometryPoint.From. For example:

GeographyPoint.From(0.1, 0.3)

…also returns a POINT record:

Points can then be used to build more complex objects, which in turn can be combined into more complex objects still. For example the following M query takes three points and creates a LINESTRING object that joins those three points into a line:

let
    Point1 = GeographyPoint.From(0,0),
    Point2 = GeographyPoint.From(1,1),
    Point3 = GeographyPoint.From(0,1),
    PointList = {Point1,Point2,Point3},
    LineRecord = [Kind="LINESTRING", Points=PointList],
    WKTLineString = Geography.ToWellKnownText(LineRecord)
in
    WKTLineString

The output is this:

LINESTRING(0 0, 1 1, 0 1)

Similarly, multiple LINESTRING objects can be combined into a MULTILINESTRING. For example:

let
    Point1 = GeographyPoint.From(0,0),
    Point2 = GeographyPoint.From(1,1),
    PointList1 = {Point1,Point2},
    Point3 = GeographyPoint.From(0,1),
    Point4 = GeographyPoint.From(1,0),
    PointList2 = {Point3, Point4},
    LineRecord1 = [Kind="LINESTRING", Points=PointList1],
    LineRecord2 = [Kind="LINESTRING", Points=PointList2],
    LineRecordList = {LineRecord1,LineRecord2},
    MultiLineRecord = [Kind="MULTILINESTRING", Components=LineRecordList],
    WKTMultiLineString = Geography.ToWellKnownText(MultiLineRecord)
in
    WKTMultiLineString

…returns:

MULTILINESTRING((0 0, 1 1), (0 1, 1 0))

Finally, the most useful application for this is to extract latitudes and longitudes or x and y co-ordinates from a column of values in WKT format, something like this:

All you need to do to extract latitude and longitude values from these points is add a custom column as a step with the expression

Geography.FromWellKnownText([MyPoints])

The result is a column of records which can be expanded to get latitude and longitude values:
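Putting that together, here’s a self-contained sketch with some made-up point values – I’m assuming the fields of the returned records are called Latitude and Longitude, matching the record format shown earlier:

let
    // A hypothetical table with a column of WKT point values
    Source = #table(
        type table [MyPoints = text],
        {{"POINT (0.1 0.2)"}, {"POINT (1.5 2.5)"}}
    ),
    // Convert each WKT value into a record
    AddRecord = Table.AddColumn(
        Source,
        "PointRecord",
        each Geography.FromWellKnownText([MyPoints])
    ),
    // Expand the record to get the latitude and longitude columns
    Expanded = Table.ExpandRecordColumn(
        AddRecord,
        "PointRecord",
        {"Latitude", "Longitude"}
    )
in
    Expanded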
