Power BI Report Performance And The Number Of Visuals On A Page

When you’re faced with a slow Power BI report it’s very easy to assume that the problem is something to do with the dataset, how it’s modelled and how the DAX has been written, and then disappear down a rabbit hole of Marco-and-Alberto videos trying to shave milliseconds off the time taken by each DAX query the report runs. It’s certainly an assumption that I make all the time, and indeed most performance problems can be fixed in this way. However this can mean that you miss other important ways of improving report performance.

Two things happened last week to remind me of this. First of all I spent some time with my colleague Miguel Myers (check out his YouTube channel) who showed me how he had tuned a customer’s report and reduced one page’s render time from 27 seconds to 10 seconds without making any changes to the dataset or DAX, while still showing the same information on the page. He did this partly by using a smaller number of visuals to display the same values – something I blogged about here when I learned it from him earlier this year – but mostly by reducing the number of visuals like shapes and textboxes that don’t display any data at all. Secondly, I was helping some other colleagues with load testing a Power BI report using this tool, trying to understand whether the performance bottleneck was inside Power BI or on the client machines running the load test; it turned out to be the latter.

You may have read the title of this post and guessed that I’m going to talk about reducing the number of visuals that display data from your dataset as a way of improving performance, but that’s not the case. In this blog post I want to show how visuals that do not display any data from your dataset can have a significant impact on report performance. Before we carry on I suggest you read the series of posts I wrote late last year on measuring the performance of reports in the browser using Chrome/Edge DevTools (part 1 is here, part 2 is here, part 3 is here) because I’ll be using techniques described in these posts in my testing.

I have a Power BI dataset with a single table in it and that table contains one column and one row:

Here’s a report page with a single card on it that displays that single value:

As you might expect, this page is very quick to render. Using a secure embed link and the Lighthouse tab in DevTools, the Time To Interactive reading on a cold cache is 3.7 seconds (which, remember, is a worst-case test and slower than most users will experience).

Now, consider a second report page in the same pbix file with the same card visual on it, but this time with 260 rectangle visuals added:

These rectangle visuals don’t display any data. The Time To Interactive reading for this page is now a whopping 24.3 seconds! What’s more, Power BI Desktop is noticeably more sluggish when editing this page.

OK, this is an extreme example but it does go to show how adding too many visuals to your report pages, even when those visuals do not display any data, can have a noticeable impact on report performance. What if you need to use lots of shape visuals to provide visual grouping in your report though? The easy solution is not to use lots and lots of shapes but to create an image with the outlines you want and set that as the page background. It can be made to look exactly the same but in this case, using an image instead of all those shapes results in a page that has a Time To Interactive reading of 4.5 seconds – only a bit slower than the original page. Most users won’t be able to spot the difference either:

Nothing of what I have said so far is particularly new or ground-breaking, and indeed another one of my colleagues, Chris Hamill, wrote about this last year in a great post on how he uses PowerPoint to design backgrounds for Power BI reports. There is one last question that needs to be considered though, and it’s the question I was confronted with this week during the load testing: how do you know if you have this problem without completely redesigning your reports? Even if you don’t think you have a particularly cluttered report, is there still an opportunity to shave a second off your report load time by removing visuals? The information you see on the Network tab in DevTools can give you a hint.

Here’s what the Network tab (see this post for how to use it) in DevTools shows for the first report page above with only the card visual on it:

Focusing on the events in the 1.5 second period after the red line shown in the timeline at the top (which is the load event for the page) and before the first of the querydata requests (which are fired when the visuals request data from the dataset, one of the last things that happen in a report render), you’ll notice there are no significant gaps.

Compare this with the same part of the Network waterfall, a period of almost 14 seconds, for the version of the page with the 260 rectangles on:

Even without seeing the details, one thing is obvious: there are some large gaps where there is no network activity going on. These gaps are where JavaScript processing is happening on the client. The more visuals on your report, the longer the JavaScript processing is likely to take. Other factors have a big impact on how long this takes too: the browser you use (new Edge and Chrome are going to be a lot faster than Internet Explorer), the spec of your PC, and other resource-hungry processes running on your PC (such as other browser instances during a load test). Therefore, if you see gaps on the Network tab, you might be able to improve the performance of your report by reducing the number of visuals and/or using a more modern browser and/or upgrading your PC. However, I must point out that the behaviour of the Power BI web front end is intentionally not documented and could change at any time, so be careful how you interpret anything you see in DevTools.

Viewing Data Privacy Partition Information With Power Query Query Diagnostics

Back in May 2020 a new feature was added to Power BI’s Power Query Query Diagnostics functionality: the ability to view data privacy partition information. The announcement blog post has some basic information on this feature but in this post I’ll go into a bit more detail about how it actually works.

If you want a refresher on the subject of Power Query data privacy then this video is a good place to start. The Power Query documentation also has a detailed article on what data privacy partitions are.

Let’s see how Query Diagnostics displays partition information using a simple example. First of all, here’s a query called DayName that reads data from an Excel file that looks like this:

and returns the text of the day name shown:

let
  Source = Excel.Workbook(
      File.Contents("C:\SelectedDay.xlsx"), 
      null, 
      true
    ),
  DayName_Table
    = Source{[Item = "DayName", Kind = "Table"]}[Data],
  #"Changed Type" = Table.TransformColumnTypes(
      DayName_Table, 
      {{"Day Name", type text}}
    ),
  #"Day Name" = #"Changed Type"{0}[Day Name]
in
  #"Day Name"

Secondly, here’s a query that gets data from the DimDate table in the Adventure Works DW in SQL Server and filters the EnglishDayNameOfWeek column by the day name returned from the Excel workbook:

let
  Source = Sql.Databases("localhost"),
  AdventureWorksDW2017 = Source{[Name
    = "AdventureWorksDW2017"]}[Data],
  dbo_DimDate = AdventureWorksDW2017{[
    Schema = "dbo", 
    Item = "DimDate"
  ]}[Data],
  #"Filtered Rows" = Table.SelectRows(
      dbo_DimDate, 
      each ([EnglishDayNameOfWeek] = DayName)
    )
in
  #"Filtered Rows"

Both the Excel workbook and the SQL Server database data sources have their privacy level set to Public.

With the Query Diagnostics options set to record data privacy partition information:

…you can click on Diagnose Step for the last step of the second query above and you’ll get a diagnostics query that returns the following:

There are three rows in this query, one for each partition (only the leftmost columns are shown here). Each partition is identified by the values in the Partition Key column. The second and third rows represent the partitions that access the SQL Server database and the Excel workbook respectively. You can see there are list values in the Accessed Resources column for these rows: these are lists of the data sources accessed by these partitions. For example here’s what the list contains for the second row above:

The first row represents the partition where the data from the DimDate table is filtered by the day name from Excel. This row has a null value for Accessed Resources because it doesn’t access any data sources directly; it does have a value in the Partition Inputs column, another list, that returns the keys of the partitions that feed data into this partition. Here’s what that list looks like:

The values in the Firewall Group column tell you something useful about the data sources used in each partition. As the article on partitioning explains, when data passes from one partition to another, it may be buffered in memory – which may then have an impact on performance. Certain types of data source such as local files and folders are considered trusted, which means that they won’t be buffered even in the presence of Private data so long as all the data sources in the partition are trusted too.

Scrolling to the right-hand side of the query shows information on the amount of time taken to evaluate each partition, while the Diagnostics column contains a nested table that has the rows from the Detailed diagnostics query but just filtered by partition:

Indeed, if you look at the Detailed and Aggregated query diagnostics queries you’ll also see a Partition Key column, which not only means that you can see partition information when looking at all the diagnostics information for your query, it also means that in situations where the Partitions query diagnostics query doesn’t appear (which happens…) you can still see useful information about data privacy.

Power Query data privacy is one of the most difficult subjects in the whole of Power BI, but it’s also really important that you understand it: apart from the performance implications it can determine whether your query even runs or not. Having information like this in Query Diagnostics means that for the first time we have detailed information about how data privacy rules are applied in a particular query.

The M Behind The New Power Query Data Types In Excel

The big news this week – at least for me – was the release of the new Power Query data types to the Excel insiders channel. You can read all about it here:

https://insider.office.com/en-us/blog/power-query-data-types-in-excel

They’re the latest manifestation of Excel linked data types; cool things are also happening with them and Power BI featured tables too.

The announcement blog post explains pretty much everything you can do right now with Power Query data types but I was curious about the M code that is used to create them. Here’s an example query that takes this source table:

…and creates this Power Query data type:

let
  Source = #table(
      type table[
        Fruit = text, 
        Colour = text, 
        Sales = number
      ], 
      {
        {"Apples", "Green", 10}, 
        {"Lemons", "Yellow", 20}, 
        {"Strawberries", "Red", 30}
      }
    ),
  #"Created data type"
    = Table.CombineColumnsToRecord(
        Source, 
        "Data type", 
        {"Fruit", "Colour", "Sales"}, 
        [DisplayNameColumn = "Fruit", TypeName
          = "Excel.DataType"]
      )
in
  #"Created data type"

The magic happens with the #"Created data type" step and the Table.CombineColumnsToRecord function; so Power Query data types are basically columns that contain record values with (I guess, I need to check) some extra metadata.
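To make that idea concrete, here’s a rough sketch in Python (not M, and not how Power Query is actually implemented) of what Table.CombineColumnsToRecord appears to do: collapse several columns of each row into a single record-valued column, keeping some metadata alongside. The field names and the "DisplayName" stand-in are my own inventions for illustration.

```python
# The source table as a list of row dicts
rows = [
    {"Fruit": "Apples", "Colour": "Green", "Sales": 10},
    {"Fruit": "Lemons", "Colour": "Yellow", "Sales": 20},
    {"Fruit": "Strawberries", "Colour": "Red", "Sales": 30},
]

def combine_columns_to_record(rows, new_column, columns, display_name_column):
    """Collapse the named columns of each row into one record-valued column."""
    combined = []
    for row in rows:
        record = {c: row[c] for c in columns}
        combined.append({
            new_column: record,
            # Stand-in for the DisplayNameColumn option in the M code above
            "DisplayName": row[display_name_column],
        })
    return combined

result = combine_columns_to_record(
    rows, "Data type", ["Fruit", "Colour", "Sales"], "Fruit"
)
```

Each output row now holds one nested record per source row, which matches what you see when you expand a data type in Excel.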

Optimising The Performance Of Power Query Merges In Power BI, Part 5: Cross Joins

In a late addition to the series of posts that I started here a few months ago, I thought it would be useful to look at how to optimise the performance of a rare (but nonetheless useful) type of merge: the cross join. Cross joins are not available as a join kind in the Table.Join or Table.NestedJoin M functions but there are two ways of creating them in Power Query:

  1. Creating a custom column in your query that contains the query that you want to cross join with and then expanding, as Matt Allington describes here
  2. Creating a custom column that contains a single value, such as the number 1, on both your queries and then using that column to merge the two queries on (something I mentioned a long time ago here)

Which one gives you the best performance? Apart from my old blog post there’s at least one other post suggesting the second approach is much faster, but I thought it would be useful to do some proper testing.

For my source data I created a query called SourceQuery that returned a table with one thousand rows, containing all the integers from 1 to 1000:

To test the first approach, I created a second query that referenced SourceQuery, then added a custom column that also returned the SourceQuery query:

…and then expanded this new custom column. The result was a query that returned a million rows – the cross join of the thousand-row SourceQuery table with itself.
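As an aside, the row-count arithmetic here can be sketched in Python (purely illustrative – Power Query does not materialise the join like this internally):

```python
from itertools import product

# The thousand-row source table: the integers 1 to 1000
source_query = list(range(1, 1001))

# A cross join pairs every row of one table with every row of the other,
# so a 1000-row table cross joined with itself yields 1000 x 1000 rows
cross_join = [(a, b) for a, b in product(source_query, source_query)]
```

The result has exactly one million rows, matching the output of the expanded custom column.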

Here’s the M code:

let
  Source = SourceQuery,
  #"Added Custom" = Table.AddColumn(
      Source, 
      "Custom", 
      each SourceQuery
    ),
  #"Expanded Custom" = Table.ExpandTableColumn(
      #"Added Custom", 
      "Custom", 
      {"Column1"}, 
      {"Custom.Column1"}
    ),
  #"Changed Type" = Table.TransformColumnTypes(
      #"Expanded Custom", 
      {{"Custom.Column1", Int64.Type}}
    )
in
  #"Changed Type"

The two timings from Profiler (the same ones that I have used to measure refresh performance throughout this series) were:

  • Progress Report End/25 Execute SQL – 0.03 seconds
  • Progress Report End/17 Read Data – 3.80 seconds

To test the second approach I added a custom column to the SourceQuery query that contained the value 1:

…and then created a new query that joined this query to itself:

…and then did an expand to achieve the same cross joined output as in the first test. Here’s the M code (all created in the Power Query Editor – no custom M code required):

let
  Source = Table.NestedJoin(
      #"SourceQuery With Join Column", 
      {"JoinColumn"}, 
      #"SourceQuery With Join Column", 
      {"JoinColumn"}, 
      "SourceQuery With Join Column", 
      JoinKind.Inner
    ),
  #"Expanded SourceQuery With Join Column"
    = Table.ExpandTableColumn(
        Source, 
        "SourceQuery With Join Column", 
        {"Column1"}, 
        {"SourceQuery With Join Column.Column1"}
      ),
  #"Removed Columns" = Table.RemoveColumns(
      #"Expanded SourceQuery With Join Column", 
      {"JoinColumn"}
    )
in
  #"Removed Columns"

The performance of this query was much faster:

  • Progress Report End/25 Execute SQL – 0.03 seconds
  • Progress Report End/17 Read Data – 0.80 seconds

So, as you can see, the second approach is the one to use.

There’s another advantage of this second approach too, if you’re using a foldable data source like SQL Server: it is possible to make query folding happen, which is of course incredibly important from a performance point of view, although you have to be careful not to change the data type of your custom column (or at least be careful how you do it). For example, here’s the M for a query that gets the cross join of the DimScenario table from the Adventure Works DW database with itself:

let
  Source = Sql.Databases("localhost"),
  AdventureWorksDW2017 = Source{[Name
    = "AdventureWorksDW2017"]}[Data],
  dbo_DimScenario = AdventureWorksDW2017{[
    Schema = "dbo", 
    Item = "DimScenario"
  ]}[Data],
  #"Removed Columns" = Table.RemoveColumns(
      dbo_DimScenario, 
      {"FactFinance"}
    ),
  #"Added Custom" = Table.AddColumn(
      #"Removed Columns", 
      "Custom", 
      each 1
    ),
  #"Merged Queries" = Table.NestedJoin(
      #"Added Custom", 
      {"Custom"}, 
      #"Added Custom", 
      {"Custom"}, 
      "Added Custom", 
      JoinKind.Inner
    ),
  #"Expanded Added Custom"
    = Table.ExpandTableColumn(
        #"Merged Queries", 
        "Added Custom", 
        {"ScenarioKey", "ScenarioName", "Custom"}, 
        {
          "Added Custom.ScenarioKey", 
          "Added Custom.ScenarioName", 
          "Added Custom.Custom"
        }
      ),
  #"Removed Other Columns" = Table.SelectColumns(
      #"Expanded Added Custom", 
      {
        "ScenarioName", 
        "Added Custom.ScenarioName"
      }
    )
in
  #"Removed Other Columns"

Here’s the SQL code generated by this query:

select [$Outer].[ScenarioName] as [ScenarioName],
    [$Inner].[ScenarioName2] as [Added Custom.ScenarioName]
from 
(
    select [_].[ScenarioKey] as [ScenarioKey],
        [_].[ScenarioName] as [ScenarioName],
        1 as [Custom]
    from [dbo].[DimScenario] as [_]
) as [$Outer]
inner join 
(
    select [_].[ScenarioKey] as [ScenarioKey2],
        [_].[ScenarioName] as [ScenarioName2],
        1 as [Custom2]
    from [dbo].[DimScenario] as [_]
) as [$Inner] on ([$Outer].[Custom] = [$Inner].[Custom2])

Displaying Routes On A Map In Power BI

In last week’s post I described the new Power Query M functions for working with Well Known Text objects, and in a post a few weeks ago I showed how you can use the Icon Map custom visual to display Well Known Text data. In this post I’ll show you how you can put all this together to do something really useful: display routes on a map in a Power BI report.

First of all you’ll need to download the latest version of the Icon Map custom visual here, because at the time of writing the version in AppSource doesn’t have the WKT functionality needed. Second, you’ll need a data source with a table of latitudes and longitudes that represent points on a route that you want to plot. I’m going to use the Azure Maps Get Route Directions API as my data source; to keep things simple I’m going to use the Shared Key authentication method for this API, but I’ll explain below why this can be a problem in the real world. To follow the example you’ll need to create an Azure Maps Account in the Azure Portal so you can get your own key.

Here’s a simple example to start off with. The following M code:

let
  Source = Json.Document(
      Web.Contents(
          "https://atlas.microsoft.com/route/directions/json", 
          [ApiKeyName = "subscription-key", Query
            = [
            #"api-version" = "1.0", 
            query
              = "51.502243,-0.140073:51.517777,-0.138621", 
            travelMode = "pedestrian"
          ]]
        )
    ),
  routes = Source[routes],
  routes1 = routes{0},
  legs = routes1[legs],
  legs1 = legs{0},
  points = legs1[points],
  PointsList = List.Transform(
      points, 
      each GeographyPoint.From(
          [longitude], 
          [latitude]
        )
    ),
  LineRecord = [Kind = "LINESTRING", Points
    = PointsList],
  WKTLineString = Geography.ToWellKnownText(
      LineRecord
    ),
  ToTable = #table(
      type table[
        Category = number, 
        WKT = text, 
        Size = number
      ], 
      {{1, WKTLineString, 1}}
    )
in
  ToTable

…calls the Azure Maps Get Route Directions API to get a walking route between two locations in central London. When you run the query you’ll be prompted to authenticate – choose the Web API option and enter your key. It’s a very basic example of how the API can be used: I have assumed there’s only one leg in the route, then taken the list of latitudes and longitudes for that single leg, turned that into a list of points and then turned that list into a linestring as I showed in last week’s blog post. Here’s the output in the Icon Map visual:

This is what the table returned by the query looks like:

The Category and Size columns aren’t very interesting; Icon Map needs these columns to display data. The WKT column contains the linestring object. Here’s how the visual is set up in Power BI:

We can do better than this though. In the example pbix file I’ve uploaded here, I’ve created a more sophisticated query that asks for six different routes between the two locations used above that all go via one waypoint (so there are two legs in the route data returned). I won’t go through the M code because there’s nothing interesting in it, but the end result is a report that allows you to plot these six different routes in different colours on a map and use a slicer to hide and show them, along with some information on their length and travel time:

One last thing to point out regarding these examples is that they won’t refresh if you publish them to the Power BI Service because I’ve used the ApiKeyName option in Web.Contents to handle key authentication (see here) and that isn’t supported in the Service yet. To work around this you’ll need to hard code the Azure Maps API key in your query and use the Skip Test Connection option in the Power BI Service.

Power Query Geography And Geometry Functions In Power BI And Excel

In the August 2020 release of Power BI Desktop a couple of new Power Query functions were added: Geography.FromWellKnownText, Geography.ToWellKnownText, GeographyPoint.From, Geometry.FromWellKnownText, Geometry.ToWellKnownText and GeometryPoint.From. These functions (which are coming soon to Power Query in Excel too) make it easier to work with geographic and geometric data in the Well Known Text format. You can have all kinds of fun with these functions if you have a visual (like the Icon Map custom visual) that can display Well Known Text data, but I’ll leave that kind of thing for future blog posts. In this post I’ll explain the basics of how the functions actually work.

Let’s start with points. The Geography.FromWellKnownText and Geometry.FromWellKnownText functions convert Well Known Text values into records, which makes it simple to do things like extract latitude and longitude values without having to do complex parsing of the text itself. For example:

Geometry.FromWellKnownText("POINT (0.1 0.2)")

…returns a record that looks like this:

Whereas the following:

Geography.FromWellKnownText("POINT (0.1 0.2)")

…returns a similar record but one with fields for latitude and longitude rather than x and y:

Now you know what the record format of a point looks like it’s easy to go in the other direction and convert a record into Well Known Text format. For example the M expression:

Geography.ToWellKnownText([Kind="POINT", Longitude=0.1, Latitude=0.3])

returns the text value

POINT(0.1 0.3)
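To give a feel for what these conversions involve, here’s a minimal Python sketch of both directions for a point – a loose illustration of the WKT format itself, not of the real Geography.FromWellKnownText/ToWellKnownText implementations, and the dict field names simply mirror the record fields described above:

```python
def geography_from_wkt_point(wkt):
    """Parse 'POINT (0.1 0.2)' into a record-like dict.

    In WKT the x co-ordinate comes first, which for geography
    data is the longitude.
    """
    kind, coords = wkt.split("(", 1)
    x, y = coords.rstrip(")").split()
    return {"Kind": kind.strip(), "Longitude": float(x), "Latitude": float(y)}

def geography_to_wkt_point(record):
    """Serialise a point record back to WKT text."""
    return f"POINT({record['Longitude']} {record['Latitude']})"

point = geography_from_wkt_point("POINT (0.1 0.2)")
wkt = geography_to_wkt_point({"Kind": "POINT", "Longitude": 0.1, "Latitude": 0.3})
```

Parsing and serialising are inverses of each other, which is exactly the round trip the two M functions give you.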

You can also generate a point by passing a longitude and latitude (and other optional parameters) to GeographyPoint.From and GeometryPoint.From. For example:

GeographyPoint.From(0.1, 0.3)

…also returns a POINT record:

Points can then be used to build more complex objects, which in turn can be combined into more complex objects still. For example the following M query takes three points and creates a LINESTRING object that joins those three points into a line:

let
    Point1 = GeographyPoint.From(0,0),
    Point2 = GeographyPoint.From(1,1),
    Point3 = GeographyPoint.From(0,1),
    PointList = {Point1,Point2,Point3},
    LineRecord = [Kind="LINESTRING", Points=PointList],
    WKTLineString = Geography.ToWellKnownText(LineRecord)
in
    WKTLineString

The output is this:

LINESTRING(0 0, 1 1, 0 1)

Similarly, multiple LINESTRING objects can be combined into a MULTILINESTRING. For example:

let
    Point1 = GeographyPoint.From(0,0),
    Point2 = GeographyPoint.From(1,1),
    PointList1 = {Point1,Point2},
    Point3 = GeographyPoint.From(0,1),
    Point4 = GeographyPoint.From(1,0),
    PointList2 = {Point3, Point4},
    LineRecord1 = [Kind="LINESTRING", Points=PointList1],
    LineRecord2 = [Kind="LINESTRING", Points=PointList2],
    LineRecordList = {LineRecord1,LineRecord2},
    MultiLineRecord = [Kind="MULTILINESTRING", Components=LineRecordList],
    WKTMultiLineString = Geography.ToWellKnownText(MultiLineRecord)
in
    WKTMultiLineString

…returns:

MULTILINESTRING((0 0, 1 1), (0 1, 1 0))
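The serialisation the two M queries above perform can be sketched in Python like this – again an illustration of the WKT format, not of the real Geography.ToWellKnownText implementation:

```python
def points_to_wkt(points):
    # points: list of (x, y) tuples -> "0 0, 1 1"
    return ", ".join(f"{x} {y}" for x, y in points)

def to_well_known_text(record):
    """Serialise a LINESTRING or MULTILINESTRING record to WKT."""
    if record["Kind"] == "LINESTRING":
        return f"LINESTRING({points_to_wkt(record['Points'])})"
    if record["Kind"] == "MULTILINESTRING":
        # Each component linestring is wrapped in its own parentheses
        inner = ", ".join(
            f"({points_to_wkt(line['Points'])})" for line in record["Components"]
        )
        return f"MULTILINESTRING({inner})"
    raise ValueError("unsupported kind")

line1 = {"Kind": "LINESTRING", "Points": [(0, 0), (1, 1)]}
line2 = {"Kind": "LINESTRING", "Points": [(0, 1), (1, 0)]}
multi = {"Kind": "MULTILINESTRING", "Components": [line1, line2]}
```

Note how the record structure mirrors the M code: a MULTILINESTRING record contains a list of LINESTRING records, each of which contains a list of points.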

Finally, the most useful application for this is to extract latitudes and longitudes or x and y co-ordinates from a column of values in WKT format, something like this:

All you need to do to extract latitude and longitude values from these points is add a custom column as a step with the expression

Geography.FromWellKnownText([MyPoints])

The result is a column of records which can be expanded to get latitude and longitude values:

The Pros And Cons Of Modelling Measures As A Dimension In Power BI

You probably know that dimensional modelling and building a star schema are very important in Power BI. In a star schema you’ll have at least one fact table, and in your fact table you’ll have two main types of column: dimension keys, which link the fact table to dimension tables, and measures, which store the data you want to aggregate and analyse. Here’s a simple example of a fact table called Sales with three dimension key columns (Product, Store and Customer) and three measure columns (Sales, Tax and Volume Sold):

Quite often, though, I see people taking a slightly different approach to modelling their fact tables: instead of having separate columns for each measure they unpivot their data, create one row in their fact table for each measure value, use a single column to store all the measure values and create a new dimension to allow the user to select which measure values they want. Here’s an example of another fact table, called Sales Unpivot, showing how the data from the Sales fact table above can be remodelled using this technique:

In this fact table the dimension keys remain the same, but the Value column stores all the data from the Sales, Tax and Volume Sold measures in the original table and the Measure Name column tells you what type of measure value is stored on any given row. Let’s call this approach the Measures Dimension approach.
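The reshaping between the two fact-table designs is just an unpivot. Here’s a minimal Python sketch of the transformation (the column names and the single sample row are invented for illustration):

```python
# One row of a conventional fact table: dimension keys plus
# one column per measure
sales = [
    {"Product": "A", "Store": 1, "Customer": 10,
     "Sales": 100.0, "Tax": 20.0, "Volume Sold": 5.0},
]

measure_columns = ["Sales", "Tax", "Volume Sold"]

# Unpivot: one output row per measure value, with the measure name
# moved into its own column (the Measures Dimension shape)
sales_unpivot = [
    {
        "Product": row["Product"],
        "Store": row["Store"],
        "Customer": row["Customer"],
        "Measure Name": m,
        "Value": row[m],
    }
    for row in sales
    for m in measure_columns
]
```

Each input row becomes three output rows, one per measure, which is why the unpivoted table is so much longer than the original.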

There are some advantages to building fact tables using the Measures Dimension approach, for example:

  • You can now use a slicer in a report to select the measures that appear in a visual
  • You can now easily add new measures without having to add new columns in your fact table
  • You can use row-level security to control which measures a user has access to

Generally speaking, though, any time you deviate from a conventional dimensional model you risk running into problems later on and this is no exception. Let’s go through the disadvantages of modelling data using a Measures Dimension.

Formatting

Notice that the Sales and Tax measure columns from the Sales fact table are currency values and that Volume Sold contains decimal values with four decimal places. It’s quite easy to set different formats for different measures when each measure is a separate column:

However, when all your values are stored in one column, as in the Measures Dimension example, formatting is not so straightforward. You might be able to get away with using one generic format for all your data:

…but that isn’t ideal. Of course you can create DAX measures and format them appropriately but then you lose some of the flexibility of this approach; you could also use a calculation group and dynamic format strings as Kasper describes here.

Compression

More seriously, Power BI does a much more efficient job of storing and compressing the data in a conventional fact table compared to when the Measures Dimension approach is used and this has consequences for query performance. Using the View Metrics button in DAX Studio to see the details of how the data is stored for each table is revealing. Here are some selected highlights:

First of all, notice that the Sales Unpivot table (which uses the Measures Dimension approach) is 66% larger than the Sales table. Notice also that in the Sales table the Sales and Tax measure columns, which contain currency values, can use the Currency data type (which shows up as Decimal here, confusingly), which in turn means that they can use Value encoding; only the Volume Sold column needs to be stored using the Decimal Number data type (which shows up as Double here), and must use Hash encoding. In the Sales Unpivot table, since all the measure values are stored in the Value column, this column has to use the Decimal Number data type and Hash encoding. As this article explains (the Definitive Guide To DAX goes into a lot more detail), Value encoding can give you a lot of performance benefits.
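As a rough illustration of the difference between the two encodings (a conceptual sketch only, not how the VertiPaq engine is actually implemented): value encoding stores the numbers themselves, possibly rebased on a minimum value, while hash encoding builds a dictionary of distinct values and stores integer indexes into it.

```python
def hash_encode(values):
    """Dictionary (hash) encoding: distinct values + integer indexes."""
    dictionary = []
    index_of = {}
    encoded = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
        encoded.append(index_of[v])
    return dictionary, encoded

def value_encode(values):
    """Value encoding: store the values directly, rebased on the minimum."""
    base = min(values)
    return base, [v - base for v in values]

# Decimal measure values need a dictionary...
dictionary, encoded = hash_encode([10.5, 20.0, 10.5, 30.25])
# ...whereas integer-like values can be stored as small offsets
base, offsets = value_encode([100, 102, 105, 100])
```

When a column has many distinct decimal values the dictionary grows large, which is one intuition for why packing every measure into a single hash-encoded Value column compresses so much worse.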

Calculation Complexity

When you start to build more complex DAX calculations then the disadvantages of the Measures Dimension approach become even more apparent. Let’s say you want a visual in your report that shows Sales, Tax and a measure that subtracts Tax from Sales called Sales After Tax:

Here’s the DAX needed for this visual:

Sales Measure = SUM('Sales'[Sales])
Tax Measure = SUM('Sales'[Tax])
Sales After Tax = [Sales Measure] - [Tax Measure]

To achieve the same result with the Measures Dimension approach, though, you need to know how to use the DAX Calculate() function, something like this:

Sales Measure 2 =
CALCULATE (
    SUM ( 'Sales Unpivot'[Value] ),
    KEEPFILTERS ( 'Sales Unpivot'[Measure Name] = "Sales" )
)
Tax Measure 2 =
CALCULATE (
    SUM ( 'Sales Unpivot'[Value] ),
    KEEPFILTERS ( 'Sales Unpivot'[Measure Name] = "Tax" )
)
Sales After Tax 2 = [Sales Measure 2] - [Tax Measure 2]

[Note that in most cases I’d create a separate dimension table for the Measures dimension, but to keep things simple here I’ve not done that]

If you expect other people to build measures on your dataset then this additional complexity can be a significant barrier to overcome. Calculate isn’t an easy function to use properly.

Calculation Performance

Last of all, there’s also a performance penalty to pay with the Measures Dimension approach. Taking the Sales After Tax example from the previous section, here’s what the Server Timings tab in DAX Studio shows for the query associated with the visual showing Sales, Tax and Sales After Tax:

Notice that there’s just one Storage Engine query: DAX fusion has kicked in so that the Sales and Tax values required can be retrieved in the same scan.

However, here’s what the Server Timings tab shows for the same visual using the Measures Dimension approach and the second set of measures using the Sales Unpivot table shown above:

Not only is this query slower but there are now two Storage Engine queries: one to get the Sales data and one to get the Tax data. Since separate scans are needed to get each measure value, the more measures you have in a visual or the more measures needed by your calculations, the more scans are needed. This can very quickly add up to a big performance problem, especially if each scan is relatively slow – which is more likely to be the case since the Measures Dimension approach means Power BI is less able to compress data effectively.

Conclusion

As you’ve probably guessed by now I’m not a big fan of the Measures Dimension approach. While there are definitely some advantages to using it I think the disadvantages – which aren’t always immediately obvious – outweigh them.

Dynamically Generated Lines On A Map In Power BI Using DAX, WKT And The Icon Map Custom Visual

I don’t generally blog about data visualisation in Power BI because it’s not my strong point, and I don’t blog about maps and geospatial analysis because I know even less about that subject and there are people like David Eldersveld who cover it so well. I do like playing around with maps though and recently I’ve been having fun with the Icon Map custom visual developed by James Dales. Probably the thing I like most about it is that so many properties accept measures as inputs, which means that you can use DAX to do some interesting things.

In this post I’ll show you a very basic example of how to display dynamically generated lines on a map – specifically, I’ll show you how to draw lines between any four cities that a user selects from a slicer. Pre-calculating all the possible combinations of cities when you load your data is not feasible because of the sheer number, so this is a scenario where being able to dynamically generate the lines between the four selected cities in DAX is really useful.

To start off, I downloaded a table with a list of cities in the UK and their latitudes and longitudes from https://simplemaps.com/data/gb-cities and then loaded this data into four identical tables in Power BI called Start, Second Stop, Third Stop and End.

I then created four slicers from the City column from these four tables, so the user can select the four different cities they want to draw lines between.

The key to making this work is to use the Icon Map visual’s ability to display geometric data in Well Known Text format; documentation on this can be found here. To achieve this I wrote some (slightly over-engineered) DAX in a measure that generates a LINESTRING object that has four lines to connect the four selected cities:

Route =
VAR CityTable =
    {
        ( SELECTEDVALUE ( 'Start'[Longitude] ), SELECTEDVALUE ( 'Start'[Latitude] ) ),
        ( SELECTEDVALUE ( 'Second Stop'[Longitude] ), SELECTEDVALUE ( 'Second Stop'[Latitude] ) ),
        ( SELECTEDVALUE ( 'Third Stop'[Longitude] ), SELECTEDVALUE ( 'Third Stop'[Latitude] ) ),
        ( SELECTEDVALUE ( 'End'[Longitude] ), SELECTEDVALUE ( 'End'[Latitude] ) )
    }
VAR FilteredCityTable =
    FILTER (
        CityTable,
        NOT ( ISBLANK ( [Value1] ) ) && NOT ( ISBLANK ( [Value2] ) )
    )
RETURN
    "LINESTRING ( "
        & CONCATENATEX ( FilteredCityTable, [Value1] & " " & [Value2], ", " )
        & ")"

Here’s what the measure returns when the cities of Bath, Birmingham, Banbridge and Bradford are selected:

LINESTRING ( -2.365556 51.379444, -1.916667 52.466667, -6.26701 54.35091, -1.75 53.783333)
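The string-building logic of the measure is easy to reason about outside DAX. Here's a minimal Python sketch (purely illustrative, not part of the report) that builds the same WKT LINESTRING from a list of (longitude, latitude) pairs, skipping any stop where no city has been selected, just as the FILTER over blank values does in the measure:

```python
def make_linestring(stops):
    # stops: list of (longitude, latitude) tuples; None stands in for a
    # blank SELECTEDVALUE, i.e. no city chosen in that slicer
    points = [
        (lon, lat)
        for lon, lat in stops
        if lon is not None and lat is not None
    ]
    # WKT LINESTRING: space-separated lon/lat pairs, comma-separated points
    return "LINESTRING ( " + ", ".join(f"{lon} {lat}" for lon, lat in points) + ")"

# Bath, Birmingham, Banbridge, Bradford
route = make_linestring([
    (-2.365556, 51.379444),
    (-1.916667, 52.466667),
    (-6.26701, 54.35091),
    (-1.75, 53.783333),
])
print(route)
```

Running this reproduces the measure output shown above; dropping a tuple (or passing None) simply shortens the LINESTRING, which is what happens in the report when a slicer has no selection.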

Finally, I dragged this measure into the Icon URL / WKT / SVG well in the Icon Map visual (NB: you need to use the version of the visual from http://www.icon-map.com/ and not the version from AppSource for now for this to work). Here's the result:

You can download the sample pbix file here.

There are far more practical things you can do with this: for example, Mimoune Djouallah has a great blog post here on how to use Well Known Text in Icon Map to display 1.2 million points on a map. Let me know if you have a real-world application for this by leaving a comment.

Returning Annotations From OData Sources In Power BI/Power Query

The Power Query OData.Feed function has an option called IncludeAnnotations that allows you to return annotation values from an OData data source. It’s not obvious how to use it though – even if you use this option when connecting, you won’t see the annotation values by default because they are returned as metadata. Here’s an example of how to get annotation values using some simple M code.

The following query gets data from the statuscode column of the bookableresourcecategories table in Dynamics CRM via OData:


let
    Source =
        OData.Feed(
            "https://xyz.crm.dynamics.com/api/data/v9.1",
            null,
            [Implementation = "2.0"]
        ),
    bookableresourcecategories_table =
        Source{[
            Name = "bookableresourcecategories",
            Signature = "table"
        ]}[Data],
    #"Removed Other Columns" =
        Table.SelectColumns(
            bookableresourcecategories_table,
            {"statuscode"}
        )
in
    #"Removed Other Columns"


To get the option set labels associated with these values, you first of all need to edit the record in the third parameter of OData.Feed and use the IncludeAnnotations option to get the FormattedValue annotation like so:


OData.Feed(
    "https://xyz.crm.dynamics.com/api/data/v9.1",
    null,
    [
        Implementation = "2.0",
        IncludeAnnotations = "OData.Community.Display.V1.FormattedValue"
    ]
)

Then you need to add a custom column in the Power Query Editor that gets the metadata from each cell in the statuscode column using the Value.Metadata function:



Value.Metadata(
[statuscode]
)[OData.Community.Display.V1.FormattedValue]?

By the way, if you’re wondering what the question mark does in this expression, it stops an error occurring if there is no OData.Community.Display.V1.FormattedValue field in the metadata record; this post has more details.
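If an analogy helps (purely illustrative; the query itself is all M), the `?` operator behaves much like Python's `dict.get`: it returns null rather than raising an error when the field is missing. The annotation name used for the "missing" lookup below is hypothetical:

```python
# Illustrative analogy: M's record[Field]? returns null when the field is
# absent, much like dict.get returns None for a missing key.
metadata = {"OData.Community.Display.V1.FormattedValue": "Active"}

label = metadata.get("OData.Community.Display.V1.FormattedValue")
# A hypothetical annotation name that is not present in the record:
missing = metadata.get("OData.Community.Display.V1.DisplayName")

print(label)    # Active
print(missing)  # None (no error, mirroring the behaviour of ? in M)
```

Without the `?` in M (or using `metadata["..."]` in the analogy above), a missing field would raise an error for every row that has no annotation.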

The full query looks like this:


let
    Source =
        OData.Feed(
            "https://xyz.crm.dynamics.com/api/data/v9.1",
            null,
            [
                Implementation = "2.0",
                IncludeAnnotations = "OData.Community.Display.V1.FormattedValue"
            ]
        ),
    bookableresourcecategories_table =
        Source{[
            Name = "bookableresourcecategories",
            Signature = "table"
        ]}[Data],
    #"Removed Other Columns" =
        Table.SelectColumns(
            bookableresourcecategories_table,
            {"statuscode"}
        ),
    #"Added Custom" =
        Table.AddColumn(
            #"Removed Other Columns",
            "Custom",
            each Value.Metadata(
                [statuscode]
            )[OData.Community.Display.V1.FormattedValue]?
        )
in
    #"Added Custom"

…and here’s the output of the query showing the option set labels in a new column:


[Thanks again to Matt Masson for providing this information]

Adding Tooltips To Column Headers In The Power Query Editor In Power BI And Excel

Here’s something that will excite all you M nerds out there! Did you know you can make tooltips appear when you hover over column headers in a table in the Power Query Editor in either Power BI Desktop or Excel?

It’s all done with M metadata, and here’s an example query:


let
    source =
        #table(
            {"firstname", "lastname"},
            {{"matt", "masson"}}
        ),
    tableType =
        type table [firstname = Text.Type, lastname = Text.Type]
        meta
        [
            Documentation.FieldDescription =
                [firstname = "Given Name", lastname = "Family Name"]
        ],
    replaceType = Value.ReplaceType(source, tableType)
in
    replaceType

Here’s what you see when your mouse hovers over the firstname column in the Power Query Editor:


…and here’s what you see when your mouse hovers over the lastname column:


How does this work? Here’s what each of the steps do:

  • The source step creates a simple table with two columns called firstname and lastname using #table (see here for more details on that).
  • The tableType step declares a new table type with two columns (the same two columns as the table from the previous step) and then adds a metadata record to this type. In that record the Documentation.FieldDescription field contains the text values that will appear as tooltips when you hover over each column.
  • The replaceType step replaces the type of the table returned by source with the type declared in tableType.

The Power Query Editor UI then looks for a Documentation.FieldDescription field in any metadata associated with a table and displays the values in that field when you hover over the appropriate column.

Note that if you add any steps to your query after this that change the table type (for example, steps that add or remove columns), the metadata is removed ☹. That said, I still feel like this might be a useful feature for anyone building a custom connector.

[Thanks to Matt Masson for telling me how all this works]

 

 
