You probably know that it’s a best practice to build your Power BI datasets in a separate .pbix file from your reports – among other things it means that different people can develop the dataset and reports. You may also know that if you are building a report in Power BI Desktop with a Live connection to a published dataset or Azure Analysis Services you can define your own measures inside the report. While this is very convenient, if you create too many measures there’s a price to pay in terms of query performance.
To illustrate this, let’s say you have a super-simple dataset published to the Power BI Service (or a database in Analysis Services Tabular or Azure Analysis Services) that contains one table with three rows in it, two columns and a simple measure:
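For illustration, the dataset might look something like this – the 'Sales' table and the Sales Amount measure names match the DAX query shown later in this post, but the column names, sample values and the exact measure definition are assumptions:

```dax
-- Assumed source table 'Sales' (column names and values are illustrative):
-- Product | Amount
-- A       | 10
-- B       | 20
-- C       | 30

-- The simple measure defined in the dataset (exact definition assumed):
Sales Amount = SUM ( 'Sales'[Amount] )
```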
If you open Power BI Desktop and create a Live connection to this dataset, you can create a new measure in the normal way and then use it in a table like so:
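As can be seen in the DAX query below, the definition of this report-level measure is simply:

```dax
MyReportMeasure = [Sales Amount] + 1
```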
If you take a look at the DAX query that is generated by this table visual you’ll notice that the MyReportMeasure measure, defined in the report, is defined at the top of the query while the Sales Amount measure, defined in the dataset, is not:
DEFINE
    MEASURE 'Sales'[MyReportMeasure] = ( [Sales Amount] + 1 )

    VAR __DS0Core =
        SUMMARIZECOLUMNS (
            ROLLUPADDISSUBTOTAL ( 'Sales'[Product], "IsGrandTotalRowTotal" ),
            "Sales_Amount", 'Sales'[Sales Amount],
            "MyReportMeasure", 'Sales'[MyReportMeasure]
        )

    VAR __DS0PrimaryWindowed =
        TOPN ( 502, __DS0Core, [IsGrandTotalRowTotal], 0, 'Sales'[Product], 1 )

EVALUATE
    __DS0PrimaryWindowed

ORDER BY
    [IsGrandTotalRowTotal] DESC, 'Sales'[Product]
Here’s what DAX Studio’s Server Timings shows about this query when it runs on a cold cache:
As you would expect it’s pretty quick, taking just 16ms.
In this example MyReportMeasure is something known as a query-scoped measure: it is created when the query runs and ceases to exist when the query finishes. The problem with this is that creating a query-scoped measure has some costs associated with it: for example, Power BI/Analysis Services needs to do some dependency analysis to find out what other measures it refers to, and the more other measures there are, the longer this takes.
To show the impact I generated the DAX definitions of 3000 measures in Excel and pasted them into the DEFINE clause of the query above:
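The exact measures generated don't matter for the test; as a sketch, the DEFINE clause would have ended up looking something like this (the dummy measure names and trivial expressions here are assumptions):

```dax
DEFINE
    MEASURE 'Sales'[Test Measure 1] = 1
    MEASURE 'Sales'[Test Measure 2] = 1
    -- ...2,997 more generated measure definitions...
    MEASURE 'Sales'[Test Measure 3000] = 1
    MEASURE 'Sales'[MyReportMeasure] = ( [Sales Amount] + 1 )
```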
[NB this is not exactly what happens in the real world: only the measures you need for a query, and the measures that those measures depend on, are defined in the query – but the dependency analysis happens all the same]
Here’s what Server Timings showed for the same query – which, remember, does not actually use any of the 3000 measures that I added:
Now 3000 measures might seem excessive but I have seen people with that many: you could have 100 base measures and then 30 combinations of different KPIs (time intelligence calculations, financial calculations like actual vs forecast and so on). My advice would be to use calculation groups instead of creating so many measures, if you can – they will be a lot easier to develop and maintain, and easier for anyone building a report to use. It’s also worth making clear that this problem only happens with query-scoped measures: no dependency analysis takes place at query time for measures defined on the dataset.
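To give a rough idea of the calculation group approach: instead of authoring a separate YTD, previous year and so on variant of every base measure, you define each KPI variation once as a calculation item that is applied to whichever measure is in context. A sketch – the group name, the item names and the assumption of a 'Date' table marked as a date table are all illustrative:

```dax
-- Calculation group 'Time Calculations', one calculation item per KPI variation.
-- Each item's expression is applied to the measure currently in context:

-- Calculation item: Current
SELECTEDMEASURE ()

-- Calculation item: YTD
CALCULATE ( SELECTEDMEASURE (), DATESYTD ( 'Date'[Date] ) )

-- Calculation item: Previous Year
CALCULATE ( SELECTEDMEASURE (), SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
```

With 100 base measures and 30 calculation items you cover the same 3000 combinations without defining 3000 separate measures.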
Also, 1.5 seconds might not seem like a big overhead, but if you’re trying to squeeze all the performance you can out of a query, or trying to understand what’s contributing to the overall duration of your query, this is good to know about.
[Thanks to Jeffrey Wang for providing the information in this post]