Understanding The “You’ve Exceeded The Capacity Limit For Dataset Refreshes” Error in Power BI

If you have a lot of Power BI semantic models that are scheduled to refresh at the same time in the Service then you may find that some of them fail with the following error:

You’ve exceeded the capacity limit for dataset refreshes. Try again when fewer datasets are being processed.

[Note: “dataset” is the old name for a Power BI semantic model. Someone should update the error message.]

What causes it? Each Fabric or Power BI Premium capacity SKU can support (and “support” is the operative word here, as we shall see) a certain number of concurrent semantic model refreshes. These limits are documented here in the Model Refresh Parallelism column of the table on that docs page:

The error itself is documented here and I’ve mentioned it myself in a previous post here, but the interesting thing about the limit on the number of concurrent refreshes is that there’s a lot more to it than you might expect – Power BI is very forgiving.

Before I go any further, it’s important to make clear that this error has nothing to do with how many CUs you are using on your capacity at the time of the error, although the limits are in place to stop you overloading your capacity: running multiple semantic model refreshes at the same time could cause a sizeable increase in CU consumption even after smoothing.

For example, to investigate how this limit is applied I created an F2 capacity, added a workspace to that capacity, and uploaded several identical Power BI semantic models to that workspace. I used some Power Query magic to control how long those semantic models took to refresh.

For my first test I configured two semantic models so they took 120 seconds to refresh and started a manual refresh on both at the same time. Now, looking at the table above, you might think that because an F2 supports one concurrent semantic model refresh I would get an error, but no: both semantic models refreshed successfully and both refreshes took 120 seconds. The published limit is the number of semantic models that Power BI guarantees can be refreshed concurrently; in practice the limit may be exceeded.

Next, I started a manual refresh on six semantic models that were all configured to take 120 seconds to refresh. Again, they all refreshed successfully and all took 120-122 seconds to refresh. Finally I started a manual refresh on fifteen semantic models that again were configured to take 120 seconds to refresh and this is where I saw something different. All of the semantic models refreshed successfully in the end, and none showed the warning triangle in the first screenshot above. Most of the semantic models took 120-122 seconds to refresh but some took longer. For example, take a look at this Refresh History for one of the models:

The overall refresh was successful but took 305 seconds, not 120 seconds. This is explained by the refresh failing immediately with the “You’ve exceeded the capacity limit for dataset refreshes” error, then the Service waiting a minute to retry the refresh (for more information on automatic refresh retries see here) which resulted in the same error occurring again, then the Service waiting for a further two minutes before retrying the refresh again, at which point it succeeded and took 122 seconds.
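The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration: the function name is made up, the wait times are the ones observed in this test rather than a documented retry schedule, and the sketch assumes that failed attempts take no measurable time.

```python
def total_refresh_seconds(retry_waits, final_refresh_seconds):
    """Elapsed time for a refresh whose failed attempts error out immediately.

    retry_waits: seconds waited before each retry (observed values, not a
    documented schedule).
    final_refresh_seconds: duration of the attempt that finally succeeded.
    """
    return sum(retry_waits) + final_refresh_seconds


# Two immediate failures, waits of 60s and 120s, then a 122s successful attempt.
elapsed = total_refresh_seconds([60, 120], 122)
print(elapsed)  # 302
```

That accounts for 302 of the 305 seconds shown in the Refresh History; the remaining few seconds are presumably overhead in the Service.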

So you can see what I mean when I say Power BI is very forgiving about these limits. It’s also worth mentioning that scheduled refreshes don’t always happen at exactly the time they are scheduled for. The Service may wait several minutes after the scheduled time before it tries the first refresh. This is what is meant by the statement in the docs here that “You can schedule and run as many refreshes as required at any given time, and the Power BI service runs those refreshes at the time scheduled as a best effort.“

In other tests with the same number of semantic models but longer refresh times, I was able to observe a scenario where a refresh scheduled for 17:30 did not start until almost eight minutes after that time and then failed nine times before it succeeded; note that the amount of time the Service waited to retry after the second failure went up to five minutes:

Of course Power BI can’t keep retrying indefinitely and eventually refreshes will fail with the “You’ve exceeded the capacity limit for dataset refreshes” error. Here’s the Refresh History for a semantic model where refresh ultimately failed after four retries (this took a lot of concurrent, slow refreshes to repro):

If you’re encountering this error then the solution is obvious: reduce the number of refreshes that are happening at any given time. But how do you know which refreshes are scheduled for when and how long they will take? The Refresh Schedule page for your capacity in the Admin Portal gives you a summary of the number of semantic models that are predicted to be refreshed in a 30 minute time slot and how long they are likely to take. The Fabric Monitoring Hub gives you details of historical activity. And if you have Workspace Monitoring or Log Analytics configured on your workspace you can get a lot of detail on what happens when refreshes are run, including seeing when the “You’ve exceeded the capacity limit” error occurs and refreshes retry.
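If you can export the start times and typical durations of your refreshes from any of these sources, working out how many overlap is straightforward. The following is a minimal sketch, assuming you already have that data as a list of start times and durations; the schedule shown is invented and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def peak_concurrency(refreshes):
    """Return the maximum number of refreshes running at the same moment.

    refreshes: iterable of (start: datetime, duration_seconds: int) tuples.
    """
    events = []
    for start, duration in refreshes:
        events.append((start, 1))                                  # refresh starts
        events.append((start + timedelta(seconds=duration), -1))   # refresh ends
    # Sort by time; at equal times, ends (-1) are processed before starts (+1)
    # so that back-to-back refreshes are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical schedule: three models at 17:30, one at 17:31, one at 18:00.
base = datetime(2026, 1, 1, 17, 30)
schedule = [
    (base, 120),
    (base, 120),
    (base, 300),
    (base + timedelta(minutes=1), 120),
    (base + timedelta(minutes=30), 120),
]
print(peak_concurrency(schedule))  # 4
```

Comparing the peak with the Model Refresh Parallelism value for your SKU tells you which time slots are most at risk.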

Once you know what is being refreshed and when, you need to do two things. First, see if you can reduce the number of times any given semantic model is refreshed. It’s pretty common for users to configure their model to refresh multiple times a day even if the actual data source only changes once a day, for example, so easy wins may be possible. Second, tuning the amount of time refreshes take can also reduce the number of concurrent refreshes: if your semantic models refresh quickly it’s less likely they will overlap with other refreshes. Tuning data sources, tuning Power Query, increasing refresh parallelism, removing unnecessary columns or tables, tuning the DAX used in calculated columns and tables or replacing those calculated columns and tables with pre-calculated data in the data source, and using incremental refresh are some of the things you will need to look at. Scaling up to a larger capacity, or buying an additional (possibly smaller) capacity and moving some workspaces over to it, will of course also solve the problem because the limits are per capacity.

In summary, what this shows is that the published, supported limits on the number of concurrent semantic model refreshes in the Power BI Service are a lot lower than what is achievable in practice. This is very important in self-service BI scenarios because it means refreshes are a lot less likely to fail than they would otherwise. But if you are refreshing a lot of semantic models, exceed the published limits on a regular basis and find some of your refreshes fail then you have no choice but to take some of the actions described above to get back under the limits.

Power BI Semantic Model Memory Errors, Part 5: The “Maximum Allowable Memory Allocation” Error

This is a very late addition to the series of posts on Power BI memory errors that I wrote back in 2024, which started here. It’s about a very rare error that is hard to deal with and often temporary, but since people do run into it from time to time I decided to write about it so there is some useful information available about it online.

The error, which can occur when you refresh a semantic model or render a report, has two associated error messages:

The operation has been cancelled because there is not enough memory available for the application. If using a 32-bit version of the product, consider upgrading to the 64-bit version or increasing the amount of memory available on the machine.

or more commonly:

You have reached the maximum allowable memory allocation for your tier. Consider upgrading to a tier with more available memory

The error number associated with this error is 0xC11C0005 or -1055129595.
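The two numbers are the same 32-bit value written in different ways: the hexadecimal form and its signed (two's complement) decimal form. Here is a small Python sketch of the conversion, which can be handy when searching logs that use one form or the other; the function name is just for illustration.

```python
def hex_to_int32(hex_code):
    """Interpret a hexadecimal error code as a signed 32-bit integer."""
    value = int(hex_code, 16)
    # Values with the top bit set are negative in two's complement.
    return value - (1 << 32) if value >= (1 << 31) else value


print(hex_to_int32("0xC11C0005"))  # -1055129595
```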

What causes it? This needs a bit of explanation and what follows is an over-simplification…

When you publish a Power BI semantic model to the Service it runs on one of hundreds of physical machines – nodes – alongside other semantic models published by other people. The Service always tries to put your semantic model on a node that has enough memory and CPU available for it to be queried or refreshed; if it decides that isn’t the case, the semantic model will be moved to a different node. The tricky thing is that the amount of memory or CPU available depends on whether the other semantic models on the same node are being refreshed or queried at any given time and how resource-intensive those queries and refreshes are. The limits on memory consumption that I wrote about in the previous posts in this series are there to stop any one semantic model consuming too much memory and causing problems for the other semantic models on the same node. While the algorithms used to determine which semantic models should be held together on a given node are very sophisticated (and are being improved all the time), sometimes something unexpected happens and the necessary resources aren’t available for a refresh or query. The errors above happen when the node your semantic model is being held on is under memory pressure.

What can you do about it? That’s a hard question to answer but it depends on whether your semantic model is part of the problem or not. If you only get this error once (and as I said, it’s a very rare error indeed) then you can ignore it – it’s just bad luck. However if you get this error repeatedly then it’s very likely that your semantic model is causing a memory spike and even if you aren’t hitting any other memory limit you are probably coming close and you should do some tuning. If you get this error when rendering a report you should look at the DAX queries generated by your visuals and work out whether you can reduce their memory usage by remodelling your data or rewriting the DAX in your measures. If you get this error when refreshing your semantic model you should see if you can reduce its size by remodelling your data or reduce memory consumption in other ways, for example by removing calculated columns or calculated tables and replacing them with columns and tables in your data source. For more information on how to measure memory consumption for a query or refresh, see the other posts in this series.

Connecting Power BI Semantic Models To Data Sources Automatically With Binding Hints

Did you know that you can configure your Power BI semantic model so that it automatically binds to a data source connection when you publish?

To illustrate how to do this, I created an Import mode Power BI semantic model in Power BI Desktop connected to the Products table in the ContosoSales sample database in the Azure Data Explorer help cluster. Anyone can connect to this source; you just need a Microsoft Account to authenticate. Here’s the M code from my semantic model:

			
let
  Source = AzureDataExplorer.Contents(
    "help", 
    null, 
    null, 
    [
      MaxRows                 = null, 
      MaxSize                 = null, 
      NoTruncate              = null, 
      AdditionalSetStatements = null
    ]
  ), 
  ContosoSales = Source
    {[Name = "ContosoSales"]}
    [Data], 
  Products1 = ContosoSales
    {[Name = "Products"]}
    [Data]
in
  Products1

		

I then published the model to the Service but of course at that point I couldn’t refresh the model there without the extra step of connecting the newly published model to the source. As you would expect, going to the Settings pane for the semantic model gave me the option to link my data source to a connection in the Service:

No surprises so far. I deleted the published semantic model and then did two things.

First I went to the Manage Connections page in the Service and created a new Shareable Cloud Connection for the Azure Data Explorer help cluster. I made a note of the connection ID:

Second, I opened my model in Power BI Desktop, scripted out the semantic model in TMDL View, then added the following Binding Hint to the model:

			
bindingInfo '{"kind":"AzureDataExplorer","path":"help"}'
	type: dataBindingHint
	connectionId: 42906b42-3e84-461f-aee4-f14fcbeb9b72

Two things to note here:

The name of the binding hint is a JSON representation of the connection. It consists of two parts: the kind, which is the type of connection (in this case a connection to Azure Data Explorer) and the path, which is a semi-colon delimited list of all the required parameters of the function used to connect to the source (in this case AzureDataExplorer.Contents). How do you work out what the kind and path are? Originally I worked it out through trial and error, looking at the diagnostic logs and metadata from the functions used to access data, and then I realised all the information was shown in the first screenshot above when the semantic model was not linked to a connection – the kind is shown as extensionDataSourceKind and the path is shown as extensionDataSourcePath. For reference, here’s what the name of a Binding Hint for a Snowflake connection looks like:

{"kind":"Snowflake","path":"xyz.snowflakecomputing.com;COMPUTE_WH"}

The connectionId is simply the connection ID from the Shareable Cloud Connection that the semantic model should be linked to.
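If you are generating models programmatically, the name of the binding hint can be built rather than typed by hand. The following Python sketch is based only on the two examples shown here: the function name is hypothetical, and it assumes the compact JSON form with no extra whitespace is what is expected, which I have not verified beyond these examples.

```python
import json


def binding_hint_name(kind, path_parts):
    """Build the JSON string used as the name of a binding hint.

    kind: the data source kind, e.g. "AzureDataExplorer".
    path_parts: the required connection parameters, joined with semi-colons.
    """
    hint = {"kind": kind, "path": ";".join(path_parts)}
    # Compact separators reproduce the whitespace-free form used in the examples.
    return json.dumps(hint, separators=(",", ":"))


print(binding_hint_name("AzureDataExplorer", ["help"]))
print(binding_hint_name("Snowflake", ["xyz.snowflakecomputing.com", "COMPUTE_WH"]))
```

The two calls print the Azure Data Explorer and Snowflake names shown above.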

I then hit Apply and was prompted to upgrade the model to a compatibility level of 1608 (Binding Hints are only available at that compatibility level and above) and clicked Yes:

Having done this, I then republished the semantic model and when I checked the Settings pane it was automatically connected to the Shareable Cloud Connection I had created and could be refreshed immediately:

You can add multiple binding hints if you have multiple connections. You can also add multiple binding hints for the same data source. All in all, this is a nice little feature that might be useful if you are programmatically generating and publishing semantic models and want to avoid an extra API call to bind your model to a data source.

[Update May 2026 – after talking to some of the engineers, I’ve been told that the Fabric list connections API is the best way to get the Kind and Path for a connection]

Power BI Semantic Model Refresh Warnings

Since March 2026, Power BI semantic models have started showing warnings in their Refresh History in the Service. This has scared a few people but in fact all that is happening is that errors which were there all along and which don’t prevent refreshes from completing are now being flagged. Documentation on this feature can be found here but let’s see an example of the type of errors that can cause these warnings.

Consider the following semantic model that consists of a calculated table called Table With Error and a physical table called Sales with two physical columns called Product and Sales, two calculated columns called Sales Forecast and VAT Forecast, and two measures called Sales Amount and Tax Amount.

Here are the definitions of the calculated columns:

			
Sales Forecast = 'Sales'[Sales] * 1.1
VAT Forecast = 'Sales'[VAT] * 1.1

Here are the definitions of the measures:

			
Sales Amount = SUM('Sales'[Sales])
Tax Amount = SUM(Sales[Tax])

And here is the definition of the calculated table:

			
Table With Error = 
FILTER(
'TableThatDoesNotExist', 
'TableThatDoesNotExist'[ColumnThatDoesNotExist]>1
)

		

There are some problems here: the VAT Forecast calculated column, the Tax Amount measure and the Table With Error calculated table all return errors because they refer to tables or columns that do not exist. You can see these errors in Power BI Desktop easily, for example in the Data pane where these items have warning triangles next to them:

…or if you look at their definitions:

None of these errors stop you from refreshing or publishing but of course you can’t use any of these items in your reports.

If you do publish and refresh this semantic model via the UI (although this does not happen if you refresh via the XMLA Endpoint) you’ll see the message “Refresh completed with warnings”:

If you click the Show link in the Details column and then the Show link in the yellow box that appears, you’ll see a dialog showing the errors for all the broken items:

If you see warnings like this you should probably go and either fix the items that are causing them or delete them. Errors like this happen frequently when you delete items in your semantic model that have measures, calculated columns or calculated tables that depend on them; there are plenty of other similar scenarios that will cause errors too.
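To make the idea of a broken dependency concrete, here is a rough Python sketch that scans the DAX expressions from the example above for table and column references that the model does not contain. It is not how Power BI detects these errors: it uses a deliberately simple regular expression rather than a real DAX parser, it only catches references written as Table[Column], and the function and variable names are made up for the illustration.

```python
import re

# Matches 'Table Name'[Column] or TableName[Column]; a deliberately simple pattern.
REFERENCE = re.compile(r"(?:'([^']+)'|(\w+))\[([^\]]+)\]")


def missing_references(expression, model_columns):
    """Return (table, column) pairs in a DAX expression that the model lacks.

    model_columns: dict mapping table name -> set of column and measure names.
    """
    missing = []
    for quoted, bare, column in REFERENCE.findall(expression):
        table = quoted or bare
        if column not in model_columns.get(table, set()):
            missing.append((table, column))
    return missing


# The objects that exist in the example model.
model = {
    "Sales": {"Product", "Sales", "Sales Forecast", "VAT Forecast",
              "Sales Amount", "Tax Amount"},
}

# The definitions from the example model.
definitions = {
    "Sales Forecast": "'Sales'[Sales] * 1.1",
    "VAT Forecast": "'Sales'[VAT] * 1.1",
    "Sales Amount": "SUM('Sales'[Sales])",
    "Tax Amount": "SUM(Sales[Tax])",
    "Table With Error": "FILTER('TableThatDoesNotExist', "
                        "'TableThatDoesNotExist'[ColumnThatDoesNotExist]>1)",
}

for name, expression in definitions.items():
    problems = missing_references(expression, model)
    if problems:
        print(name, "->", problems)
```

Only VAT Forecast, Tax Amount and Table With Error are reported, which matches the three items flagged with warning triangles.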

Power BI And Support For Third Party Semantic Models

I’ve been working with Microsoft BI tools for 28 years now and for all that time Microsoft has been consistent in its belief that semantic models are a good thing. Fashions have changed and at different times the wider BI industry has agreed and disagreed with this belief; right now, semantic models are cool again because everyone has realised how important they are for AI. As a result, some of Microsoft’s partners and competitors (and sometimes it’s not clear which is which) have invested in building their own semantic models and/or metrics stores, some of which don’t work at all with Power BI, some of which only work with significant limitations, and a very small number which are fully supported and work with only minor limitations. This naturally raises the question of whether Power BI will ever work properly with any or all of them. The answer is no, and in this blog post I’ll explain why.

The first thing to make clear is that the reasons why some semantic models work well with Power BI and others don’t are purely technical. It is not because Microsoft has some grand plan to stifle competing BI tools. If you look at Fabric as a whole, you’ll see that Microsoft works closely with Databricks, Snowflake, DBT and many other companies to ensure that it integrates closely with them and gives customers the option to work with whichever other tools they want to use. In Power BI there are connectors to a wide range of data sources, not just Microsoft ones. Over the last year the Power BI team has spoken to all major vendors of third-party semantic models about integration with Power BI and it has been clear about what is and isn’t technically feasible. The door remains open for future collaboration and Microsoft respects the motives of these other vendors, in particular those who are developing open standards.

To understand the technical issues, let’s look at the architecture of a simple Power BI solution that uses an Import mode semantic model – as the vast majority of Power BI solutions do:

In this case the data from the data sources is copied into the Power BI semantic model, which also contains information on how the different tables of data should be joined to each other, measures (defined in the DAX language) describing how data should be aggregated and how more complex business calculations should be performed, which columns are visible and which ones are hidden, and a lot more. When the Power BI report is rendered it sends queries, again in the DAX language, to the semantic model to get the data it needs for each visual.

How could a third-party semantic model be used instead here? Power BI reports connect to Power BI semantic models using the XMLA protocol, and that means that Power BI reports can also connect to older Azure Analysis Services and SQL Server Analysis Services semantic models too. Some vendors have come up with a solution whereby they implement support for XMLA and tell their customers to connect to their semantic models using the SQL Server Analysis Services connector. This works up to a point but as you can imagine, using the SQL Server Analysis Services connector to connect to something that isn’t SQL Server Analysis Services is not supported and not wholly reliable.

It’s worth noting that using a third-party semantic model as a data source for an Import mode Power BI semantic model is not an option either because if Power BI imports metrics like percentage shares or time intelligence calculations it will not be able to aggregate data and get the correct result. Most metrics need to be calculated after the base data has been aggregated to work properly.
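A small worked example shows why. The Python sketch below uses invented sales figures: it calculates product A's percentage share of sales within each region (the kind of pre-calculated metric an import would bring in), then compares summing or averaging those imported shares with calculating the share from the aggregated base data.

```python
# Hypothetical base data: (region, product, sales).
base = [
    ("North", "A", 10), ("North", "B", 30),
    ("South", "A", 60), ("South", "B", 100),
]


def share(rows, product):
    """Share of total sales for one product, calculated from base rows."""
    total = sum(sales for _, _, sales in rows)
    return sum(sales for _, p, sales in rows if p == product) / total


# What an import would see: one pre-calculated share per region.
imported = {
    region: share([r for r in base if r[0] == region], "A")
    for region in ("North", "South")
}

summed = sum(imported.values())     # aggregating the imported metric with SUM
averaged = summed / len(imported)   # aggregating it with AVERAGE
correct = share(base, "A")          # calculating after aggregating the base data

print(imported)   # {'North': 0.25, 'South': 0.375}
print(summed)     # 0.625
print(averaged)   # 0.3125
print(correct)    # 0.35
```

Neither the sum nor the average of the imported shares gives the correct overall share of 35%, which can only be calculated from the base data.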

There are two other storage modes available for Power BI semantic models: Direct Lake and DirectQuery. Direct Lake only works with data stored in, or which can be reached via a shortcut from, Fabric OneLake so we don’t need to discuss it here. In DirectQuery mode the Power BI semantic model doesn’t store any data and instead, when it is queried, it generates SQL queries to get the data it needs from a data source on demand.

Other vendors of third-party semantic models have taken the approach of suggesting the use of Power BI in DirectQuery mode and having it run SQL against their semantic model. Apart from the fact that DirectQuery mode is usually slower and less cost-effective than Import mode or Direct Lake mode, your first reaction to this would probably be that putting one semantic model on top of another semantic model doesn’t make any architectural sense and you’d be right. There are several serious problems that emerge when you try to use Power BI in this way.

For example, a Power BI semantic model assumes that you have your data modelled as a star schema and that it will be able to generate SQL that joins dimension tables to fact tables. Not all third-party semantic models support something as basic as this yet. What’s more a Power BI semantic model assumes that it will be where all metrics will be calculated, which means that despite some interesting workarounds by third-party vendors (such as making the SQL SUM() function not actually sum up values) you can never be sure that you’ll get the correct values for a metric defined in a third-party semantic model, for example for subtotals or grand totals. There are a lot of other, similar problems that the Power BI team have made these third-party semantic model vendors aware of. These problems are not specific to Power BI semantic models either: no other semantic model would work well with another semantic model as its source.

If you can’t use Power BI semantic models on top of third-party semantic models, is it an option to synchronise calculations defined in a third-party semantic model to a Power BI semantic model? Yes, that is certainly possible and supported, and some of our partners (such as our friends at Tabular Editor) have already started down this path. DAX is a very rich language for defining metrics and Microsoft has invested a lot recently in making changes to Power BI semantic models programmatically as easy as possible. Without a doubt any metrics defined in a third-party semantic model can be reproduced in DAX, although since DAX is a much better fit for defining metrics than SQL you’ll probably find that some of the metrics you need can only be defined in DAX. In which case, rather than defining some of your metrics in a third-party semantic model and some in a Power BI semantic model, why not define all of them in your Power BI semantic model?

The final point to make is that Power BI semantic models can be used with a wide range of BI tools, not just Power BI reports. Apart from Microsoft tools like Excel and Fabric Paginated Reports, Tableau and several other non-Microsoft tools that you might think of as competitors to Power BI can also be used as a front-end for Power BI semantic models and this is supported. There is nothing stopping other BI tools from implementing connectivity to Power BI semantic models in the future. In Fabric you can even query a Power BI semantic model in SQL and extract data into a Pandas Dataframe in Python using the Semantic Link library. Anyone arguing that Power BI semantic models are somehow not “open” is wrong.

I’ll be honest, I think a lot of the reason why organisations that already use Power BI extensively consider third-party semantic models is because some people – not the Power BI users themselves, often people from a data engineering or database background – think of Power BI as just a visualisation tool and don’t realise that it also has the most mature, capable, widely used semantic model available in the market today. It is designed for both self-service and enterprise BI scenarios. Microsoft has no plans to make Power BI’s front end work properly with anything other than its own semantic models because that would be a huge amount of work with few benefits to customers: these third-party semantic models all behave differently and are at different levels of maturity, so any changes made in Power BI to accommodate them would risk breaking existing functionality or limit the use of advanced features. 35 million users view Power BI reports every month and those users query 20 million Power BI semantic models. Microsoft’s strategy is to continue to invest in and strengthen Power BI semantic models for those customers. So if Power BI is how you want your end users to consume data, then Power BI semantic models, not any other third-party semantic model or metrics store, are the right place to store your metrics definitions and your business logic.

Role-Playing Dimensions In Fabric Direct Lake Semantic Models Revisited

Back in September 2024 I wrote a blog post on how to create multiple copies of the same dimension in a Direct Lake semantic model without creating copies of the underlying Delta table. Not long after that I started getting comments that people who tried following my instructions were getting errors, and while some bugs were fixed others remained. After asking around I have a workaround (thank you Kevin Moore) that will avoid all those errors, so while we’re waiting for the remaining fixes here are the details of the workaround.

Let’s say you have a Direct Lake on OneLake semantic model with two tables, a fact table called Conversation and a dimension table called Person. The Conversation fact table has one row for a conversation between two people, but at this point there is only one Person dimension in the model with a relationship from the FromPersonId column on Conversation to the PersonId column on Person:

How can you add a second copy of the Person dimension table without duplicating the data in OneLake?

In Power BI Desktop, while editing the semantic model, go to TMDL View and in the Data pane on the right hand side switch to the Model pane:

Expand Expressions and drag it into the TMDL pane to script it out. It should look something like this:

Then you need to make two changes:

On the line that starts “expression”, line 3 in the screenshot above, change the name of the expression to something new and unique
Delete the line that contains the lineage tag, line 8 in the screenshot above

Here’s what it should look like after:

Click Apply and this will create a duplicate Expression in the model. This is the trick that makes everything else work.

Next, create a new script in TMDL View and drag the Person dimension into it to script it out.

Then make the following changes to the script:

On the line that starts “table”, line 3 in the screenshot above, change the name of the table to something new and unique
On the line that starts “expressionSource”, line 31 in the screenshot above, change the name of the source expression to that of the new Expression created earlier
Delete all lines with lineage tags, i.e. those that start “lineageTag”
Add one line at the end with the text “changedProperty = Name”

Here’s what it should look like after:

Click Apply and this will create a copy of the dimension in the semantic model.

Then, back in Diagram View, you’ll see the new dimension table but with a warning saying that it hasn’t been refreshed. The next step is to refresh the model using the Schema and Data option:

At this point the new dimension table can be used like any other table, so you can create the relationship between the ToPersonId column on Conversation and the PersonId column on the new ToPerson dimension:

Power BI, Parallelism And Dependencies Between SQL Queries In DirectQuery Mode

This is going to sound strange, but one of the things I like about tuning Power BI DirectQuery semantic models is that their generally-slower performance and the fact you can see the SQL queries that are generated to get data makes it much easier to understand some of the innermost workings of the Power BI engine. For example this week I was trying to tune a DAX query on a DirectQuery model using DAX Studio and the Server Timings showed me something like this:

As I described here, Power BI can send SQL queries in parallel in DirectQuery mode and you can see from the Timeline column there is some parallelism happening here – the last two SQL queries generated by the DAX query run at the same time – but everything has to wait for that first SQL query to complete. Why? Can this be tuned?

Here’s the scenario that produced the query above. I have a DirectQuery semantic model built from the ContosoDW SQL Server sample database:

There are three base measures defined:

			
Distinct Customers = DISTINCTCOUNT(FactOnlineSales[CustomerKey])
January Customers = 
CALCULATE([Distinct Customers], 
KEEPFILTERS('DimDate'[CalendarMonthLabel]="January"))
Monday Customers = 
CALCULATE([Distinct Customers], 
KEEPFILTERS('DimDate'[CalendarDayOfWeekLabel]="Monday"))

		

Note that these measures are written specifically to prevent fusion from taking place: each measure generates a separate SQL query. Here’s what DAX Studio’s Server Timings shows for the DAX query generated for the table shown above:

As you can see, the three SQL queries generated by the DAX query are run in parallel.

Now consider the following measure:

			
IF Test = IF([Distinct Customers]>3000, [January Customers], [Monday Customers])

Here’s what this measure returns:

If you run the query generated for this visual in DAX Studio, Server Timings shows what I showed in the first screenshot in this post:

The last two substantial SQL queries, on lines 4 and 5, can only run when the first SQL query, on line 1, has finished. The details of SQL queries tell you more about what’s going on here. The first SQL query, on line 1, just gets the values for the [Distinct Customers] measure for all rows in the table:

The WHERE clauses for the SQL queries on line 4:

…and line 5:

…show that these last two queries only get the values for the [January Customers] and [Monday Customers] measures for the rows where the [IF Test] measure needs to display them. And this explains why the first SQL query has to finish before these last two SQL queries can be run: the WHERE clauses of these last two SQL queries are constructed using the results returned by the first SQL query.

There is another way of evaluating the IF condition in the [IF Test] measure. Instead of “strict” evaluation, where the engine only gets the value of [January Customers] for the rows in the table where [Distinct Customers] is greater than 3000 and only gets the value of [Monday Customers] for the remaining rows, it can get values for [January Customers] and [Monday Customers] for all rows in the table and then throw away the values it doesn’t need. This is “eager” evaluation and as you would expect, Marco and Alberto have a great article explaining strict and eager evaluation here that is worth reading; Power BI can decide to use either strict or eager evaluation with the IF function depending on which one it thinks will be more efficient. However you can force the use of eager evaluation by using the IF.EAGER DAX function instead of IF:

			
IF EAGER Test = 
IF.EAGER([Distinct Customers]>3000, [January Customers], [Monday Customers])

Here’s what Server Timings shows for the DAX query that uses IF.EAGER:

As you can see, the use of IF.EAGER means that the three substantial SQL queries generated by Power BI for this DAX query can now be run in parallel because there are no dependencies between them: they get the values of [Distinct Customers], [January Customers] and [Monday Customers] for all rows in the table. However, even though these three SQL queries are now run in parallel, it doesn’t result in any performance benefits here because it looks like the three queries are slower as a result of all being run at the same time. Power BI has made the right call to use strict evaluation with the IF function in this case but if you see it using strict evaluation I think it’s worth experimenting with IF.EAGER to see if it performs better – especially in DirectQuery mode where Power BI knows less about the performance characteristics of the database you’re using as your data source.
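The difference between the two strategies can be modelled in a few lines of Python. This is a conceptual sketch only, not a description of how the engine is implemented: dictionaries stand in for the results of the SQL queries, the data is invented, and the function names are hypothetical. Each function returns the final values plus the number of branch values it had to fetch.

```python
def strict_if(rows, condition, if_true, if_false, threshold):
    """Strict evaluation: the branch lookups depend on the condition results."""
    # Step 1: must finish first, because its results define the filters below.
    cond = {r: condition[r] for r in rows}
    true_rows = [r for r in rows if cond[r] > threshold]
    false_rows = [r for r in rows if cond[r] <= threshold]
    # Step 2: two lookups restricted to the rows that need them.
    result = {r: if_true[r] for r in true_rows}
    result.update({r: if_false[r] for r in false_rows})
    return result, len(true_rows) + len(false_rows)


def eager_if(rows, condition, if_true, if_false, threshold):
    """Eager evaluation: all three lookups cover every row and are independent."""
    cond = {r: condition[r] for r in rows}
    true_values = {r: if_true[r] for r in rows}
    false_values = {r: if_false[r] for r in rows}
    # Values for the branch that is not needed are discarded here.
    result = {
        r: true_values[r] if cond[r] > threshold else false_values[r]
        for r in rows
    }
    return result, len(true_values) + len(false_values)


# Invented values standing in for the three measures.
rows = ["Red", "Blue", "Green"]
distinct_customers = {"Red": 5000, "Blue": 1200, "Green": 3500}
january_customers = {"Red": 400, "Blue": 90, "Green": 310}
monday_customers = {"Red": 700, "Blue": 150, "Green": 520}

strict_result, strict_fetched = strict_if(
    rows, distinct_customers, january_customers, monday_customers, 3000)
eager_result, eager_fetched = eager_if(
    rows, distinct_customers, january_customers, monday_customers, 3000)

print(strict_result == eager_result)   # True
print(strict_fetched, eager_fetched)   # 3 6
```

Both strategies return the same values; strict evaluation fetches less data but has to run in two dependent steps, while eager evaluation fetches more data but has no dependencies between its lookups.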

[Thanks to Phil Seamark for helping me understand this behaviour]

Measuring Power BI Report Page Load Times

If you’re performance tuning a Power BI report the most important thing you need to measure – and the thing your users certainly care about most – is how long it takes for a report page to load. Yet this isn’t something that is available anywhere in Power BI Desktop or in the Service (though you can use browser dev tools to do this) and developers often concentrate on tuning just the individual DAX queries generated by the report instead. Usually that’s all you need to do, but running multiple DAX queries concurrently can affect the performance of each one, and there are other factors (for example geocoding in map visuals or displaying images) that affect report performance, so if you do not look at overall page render times you might miss them. In this post I’ll show you how you can measure report page load times, and the times taken for other forms of report interaction, using Performance Analyzer in the Service and Power Query.

Consider the following series of interactions with a published Power BI report:

The report itself isn’t really that important – just know that there are a series of interactions with a slowish report while Performance Analyzer is running. Here’s what Performance Analyzer shows by the end of these interactions:

Here’s a list of the interactions captured:

I changed from a blank report page to a page with a table visual, where the table visual was cached and displayed immediately
I then refreshed the table visual on that page by clicking the Refresh Visuals button in the Performance Analyzer pane
I changed to the next page in the report and all the visuals on that page rendered
I changed the slicer on that new page
I clicked on the bar chart to cross-filter the rest of the page

As you can see from the screenshot above, Performance Analyzer tells you how long each visual takes to render within each interaction but it doesn’t tell you how long each interaction took in total. In a lot of cases you can assume that the time taken for an interaction is the same as the time taken for the slowest visual to render, but that may not always be true.

So how can you use Performance Analyzer to measure the time taken for these interactions? How can you measure the amount of time taken to render a page in a report?

To solve this problem I created a Power Query query that takes the event data JSON file that you can export from Performance Analyzer and returns a table showing the amount of time taken for each interaction. Here’s the M code for this query:

			
let
    // Load the event data JSON file exported from Performance Analyzer (change the path to point to your own file)
    Source = Json.Document(File.Contents("C:\PowerBIPerformanceData.json")),
    ToTable = Table.FromRecords({Source}),
    // Get the list of events and turn it into a table with one row per event
    Events = ToTable{0}[events],
    EventTable = Table.FromList(Events, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(EventTable, "Column1", {"name", "start", "id", "metrics", "end"}, {"name", "start", "id", "metrics", "end"}),
    #"Expanded metrics" = Table.ExpandRecordColumn(#"Expanded Column1", "metrics", {"sourceLabel"}, {"sourceLabel"}),
    // Only "User Action" events get a value in the UserActionID and UserActionLabel columns
    #"Added Custom1" = Table.AddColumn(#"Expanded metrics", "UserActionID", each if [name]="User Action" then [id] else null),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "UserActionLabel", each if [name]="User Action" then [sourceLabel] else null),
    #"Changed Type" = Table.TransformColumnTypes(#"Added Custom2",{{"start", type datetime}, {"end", type datetime}, {"UserActionID", type text}, {"sourceLabel", type text}, {"UserActionLabel", type text}}),
    // Fill down so that every event is associated with the user action that precedes it
    #"Filled Down" = Table.FillDown(#"Changed Type",{"UserActionID", "UserActionLabel"}),
    // Remove events whose start or end dates are in 1970
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each [start] > #datetime(1970, 1, 2, 0, 0, 0)),
    #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [end] > #datetime(1970, 1, 2, 0, 0, 0)),
    // For each interaction find the earliest start and the latest end, then subtract to get the duration
    #"Grouped Rows" = Table.Group(#"Filtered Rows1", {"UserActionID", "UserActionLabel"}, {{"Start", each List.Min([start]), type nullable datetime}, {"End", each List.Max([end]), type nullable datetime}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Duration", each [End]-[Start], type duration),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"UserActionID"})
in
    #"Removed Columns"

Here’s the output of this query for the interactions shown above:

Some notes about this query:

You will need to change the Source step to point to the JSON file you have exported from Performance Analyzer
Each interaction is represented by a row in the table and identified by the UserActionLabel column
I’m calculating the durations by finding the minimum start time and the maximum end time for all events associated with an interaction and subtracting the former from the latter
There’s a bug (which hopefully gets fixed at some point) where some events have start and end dates in 1970, so I have filtered out any dates that are obviously wrong
The Duration column shows how long each interaction took and uses the Power Query duration data type, which is formatted as days.hours:minutes:seconds
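
If you would rather do this outside Power Query, the same logic can be sketched in a few lines of Python. Treat this as an approximation rather than a tested replacement for the M query: the field names (name, id, start, end and metrics.sourceLabel) are the ones the M query reads, but the timestamp format, the event names other than “User Action” and the sample events are assumptions I have made up for illustration. Skipping events with a missing start or end is also my own choice.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

# Anything on or before this point is treated as one of the bogus 1970 dates
CUTOFF = datetime(1970, 1, 2, tzinfo=timezone.utc)


def parse_timestamp(text: str) -> datetime:
    """Parse an ISO 8601 string, treating a missing offset as UTC (an assumption)."""
    stamp = datetime.fromisoformat(text.replace("Z", "+00:00"))
    return stamp if stamp.tzinfo else stamp.replace(tzinfo=timezone.utc)


def interaction_durations(events: List[dict]) -> List[Tuple[Optional[str], object]]:
    """Return (label, duration) for each interaction, in order of first appearance."""
    groups: Dict[tuple, list] = {}   # (id, label) -> [earliest start, latest end]
    current = (None, None)           # the "fill down" state
    for event in events:
        if event.get("name") == "User Action":
            label = (event.get("metrics") or {}).get("sourceLabel")
            current = (event.get("id"), label)
        if not event.get("start") or not event.get("end"):
            continue                 # my choice: ignore incomplete events
        start = parse_timestamp(event["start"])
        end = parse_timestamp(event["end"])
        if start <= CUTOFF or end <= CUTOFF:
            continue                 # filter out the 1970 dates
        if current in groups:
            groups[current][0] = min(groups[current][0], start)
            groups[current][1] = max(groups[current][1], end)
        else:
            groups[current] = [start, end]
    return [(label, end - start) for (_id, label), (start, end) in groups.items()]


# Made-up events: one interaction with two overlapping visuals and one bad 1970 event
sample_events = [
    {"name": "User Action", "id": "a1", "metrics": {"sourceLabel": "Changed page"},
     "start": "2025-01-01T10:00:00.000Z", "end": "2025-01-01T10:00:00.010Z"},
    {"name": "Placeholder visual event", "id": "v1",
     "start": "2025-01-01T10:00:00.000Z", "end": "2025-01-01T10:00:02.000Z"},
    {"name": "Placeholder visual event", "id": "v2",
     "start": "2025-01-01T10:00:01.000Z", "end": "2025-01-01T10:00:04.000Z"},
    {"name": "Placeholder visual event", "id": "v3",
     "start": "1970-01-01T00:00:00.000Z", "end": "1970-01-01T00:00:00.000Z"},
]

durations = interaction_durations(sample_events)
print(durations[0][0], durations[0][1].total_seconds())
# prints: Changed page 4.0
```

In the made-up data the slowest visual takes three seconds but the interaction takes four because the second visual starts a second late, which is the kind of case where looking only at the slowest visual would mislead you.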

The example above is fairly complex, showing several different kinds of interactions. If you just want to find the amount of time taken to render all the visuals on a page you can click the Refresh Visuals button in Performance Analyzer to refresh all the visuals on the page – it may not give you a 100% “cold cache” page render but it will be good enough. I’m not a web developer but I think to really do things properly you’ll need to open the report on a blank page in the browser, do an “Empty Cache and Hard Reload”, go to edit mode in the report, enable Performance Analyzer, then move to the page you want to test.

If you’re testing a DirectQuery model then you’ll also want to include the overhead of opening connections (which can be substantial); the only ways I have found to do that are either to wait for at least an hour for any connections in the pool to be dropped or, if you’re using a gateway, to restart it.

One last point to make is that while you can use Performance Analyzer in Power BI Desktop and in the browser, the behaviour of Power BI may be different in these two places, so always make sure you measure the performance of published reports in the browser because that’s where your users will be using your reports.

Here’s what clicking the Refresh Visuals button in Performance Analyzer to refresh all the visuals on a page looks like:

This results in a single interaction and a single row in the output of the Power Query query above:

In this case you can see that the page refresh took 12.14 seconds.

As you will have realised by now, getting the amount of time it takes to load a report page isn’t straightforward and there are a lot of factors to take into account. Nonetheless using Performance Analyzer in this way is much better than not measuring page load times at all or (as I’ve seen some people do) using a stopwatch. If you try this and find something interesting please let me know: I’m doing a lot of testing with Performance Analyzer and learning new things all the time.

New Books: “The Definitive Guide To DAX” 3rd Edition And “Microsoft Power BI Visual Calculations”

For some reason I haven’t had any free copies of books to review recently; maybe the market for tech books has finally collapsed with AI? Books are still being published though and luckily, as someone who once published a book via an O’Reilly imprint, I have a lifetime subscription to O’Reilly online learning which gives me free access to all the tech books I ever need. Two books were published in the last few months that I was curious to read: the third edition of “The Definitive Guide To DAX” by my friends Marco Russo and Alberto Ferrari, and “Microsoft Power BI Visual Calculations” by my colleague Jeroen ter Heerdt, Madzy Stikkelorum and Marc Lelijveld. As I’ve said many times, I don’t write book reviews here (least of all reviews of books by friends or colleagues where I could never be unbiased), but I think there’s some value in sharing my thoughts on these books.

“The Definitive Guide To DAX”, 3rd Edition

It’s generally accepted that the one book that anyone who is serious about Power BI should own is “The Definitive Guide To DAX”. If you don’t already own a copy you should buy one, but since most people who read my blog probably have one already the more interesting question to ask is what’s new in the third edition and whether it’s worth upgrading – especially since I’d seen Marco say that the book had been completely rewritten. I’ve heard the “completely rewritten” line before and I was sceptical but it turns out that it really is a very different book. It’s not completely rewritten because there is material there from previous editions but there are a lot of changes.

First of all, as you would expect, all the new additions to DAX since the second edition was published are covered including user defined functions, visual calculations, calendar-based time intelligence functionality and window functions. These are all really important features you will want to use in your semantic models and reports so this is the main reason you’d want to buy a copy of this edition.

Secondly, the main (and justified) criticism of the previous editions was that they were, as we say in the UK, “heavy going”. They had absolutely all the information you would ever need but they were not the easiest books to read or understand. That has been addressed in the third edition: the tone is a little bit more friendly and difficult concepts are now explained visually as well as in text. As a result it’s easier to recommend the book for beginners.

Thirdly, some advanced topics (for example around performance tuning) have been dropped. For example I searched for the term “callback” in this new edition and found no mentions; that’s not true of the second edition. I have mixed feelings about this because it means the book isn’t as “definitive” as it used to be, but I can understand why it’s happened: with so much new content to add, keeping these advanced topics would have made an already long book too long. And let’s be honest, how often do you look at the details of a DAX query plan? If the aim is to teach DAX then cutting content means it’s easier for the reader to focus on the core concepts.

In summary, then, another great piece of work from Marco and Alberto and worth buying even if you have a copy of an earlier edition.

“Microsoft Power BI Visual Calculations”

A whole book about visual calculations? As I mentioned above, they’re covered in one chapter of “The Definitive Guide To DAX” but that book focuses on DAX; this one takes more time to explain the concepts and, crucially, includes a lot of practical examples of how to use them. As with user-defined functions, when visual calculations were released there was an explosion of community content showing how they can be used to solve problems that were difficult to solve in Power BI before – problems that no-one could have anticipated would be solved with visual calculations. The real value of this book is showing how to build a bump chart or a tornado chart with visual calculations and that makes it worth checking out.

Closing thoughts: why buy a book?

As you would expect, a lot of the information contained in these books is already available for free somewhere on the internet. And with AI you don’t even need to know how to search for it or stitch it all together – you can ask a question and get an answer customised to your exact scenario. So why buy books any more? I guess it depends on whether you only want to get your problems solved or understand how to solve problems yourself. For me (even though my attention span has eroded in recent years, just like everyone else’s) the only way to grasp really difficult concepts is through long-form written explanations or training courses, not fragments found in blog posts or 10-minute videos. I suspect that AI is the final nail in the coffin of the tech publishing industry but the tech book industry not being viable any more is not the same thing as tech books not being useful any more. Or maybe I’m just old-fashioned.

New Performance Optimisation for Excel PivotTables Connected To Power BI Semantic Models

Some good news: an important optimisation has rolled out for Excel PivotTables connected to Power BI semantic models! Back in 2019 I wrote about a very common problem affecting the MDX generated by PivotTables connected to Analysis Services where, when subtotals are turned off and grand totals are turned on, the query nevertheless returns the subtotal values. This led to extremely slow, expensive MDX queries being run and a lot of complaints. The nice people on the Excel team have now fixed this problem and PivotTables connected to Power BI semantic models generate MDX queries that only return the values needed by the PivotTable.

Here’s an example of a PivotTable connected to a published Power BI semantic model:

Note that the subtotals have been turned off but the grand totals are displayed – this is important. Here’s the MDX query generated for this PivotTable:

SELECT NON EMPTY 
{ /* GTOPT-BEGIN CSECTIONS=2 */ 
 /* GTOPT-SECT-BEGIN-1 Desc:GrandTotal */ 
{([Property Transactions].[New].[All],[Property Type].[Property Type Name].[All])}
 /* GTOPT-SECT-END-1 */ 
,
 /* GTOPT-SECT-BEGIN-2 Desc:Detailed */ 
{Hierarchize(CrossJoin({[Property Transactions].[New].[New].AllMembers}, 
{([Property Type].[Property Type Name].[Property Type Name].AllMembers)}))}
 /* GTOPT-SECT-END-2 */ 
} /* GTOPT-END */ 
 DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME ON COLUMNS  
 FROM [Model] 
 WHERE ([Measures].[Count Of Sales]) 
 CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

And here’s what this query returns:

There are 11 values displayed in the PivotTable and the MDX query returns 11 values. It’s what you’d expect but as I said, up to now, Excel would have generated an MDX query that returned 13 values – a query that also requested the subtotal values that aren’t displayed.

This optimisation should now be rolled out to 100% of Excel users. You can tell if you are using the new query pattern by looking for comments in the MDX code with the text “GTOPT” in – they’re easy to spot in the query shown above. Right now the optimisation only happens for PivotTables connected to Power BI semantic models but I’ve been told that in future it should also happen for PivotTables connected to Azure Analysis Services and SSAS; this is because some server-side optimisations are necessary to make the new MDX perform as well as possible.
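
If you have a lot of captured MDX queries to check, looking for these comments is easy to script. The following Python sketch assumes the comment format is exactly what appears in the query shown above; that format isn’t documented anywhere I know of and could change, and the function names here are my own invention.

```python
import re
from typing import List, Tuple

# Matches comments such as: /* GTOPT-SECT-BEGIN-1 Desc:GrandTotal */
SECTION_PATTERN = re.compile(r"/\*\s*GTOPT-SECT-BEGIN-(\d+)\s+Desc:(\w+)\s*\*/")


def uses_gtopt(mdx: str) -> bool:
    """True if the query text contains any GTOPT comment."""
    return "GTOPT" in mdx


def gtopt_sections(mdx: str) -> List[Tuple[int, str]]:
    """Return (section number, description) for each GTOPT section found."""
    return [(int(number), desc) for number, desc in SECTION_PATTERN.findall(mdx)]


# A fragment of the query shown earlier in this post
sample_mdx = """
SELECT NON EMPTY
{ /* GTOPT-BEGIN CSECTIONS=2 */
 /* GTOPT-SECT-BEGIN-1 Desc:GrandTotal */
{([Property Transactions].[New].[All],[Property Type].[Property Type Name].[All])}
 /* GTOPT-SECT-END-1 */
,
 /* GTOPT-SECT-BEGIN-2 Desc:Detailed */
"""

print(uses_gtopt(sample_mdx), gtopt_sections(sample_mdx))
# prints: True [(1, 'GrandTotal'), (2, 'Detailed')]
```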

You might be thinking that, despite my excitement, this is a very niche scenario but I assure you it’s not: Excel users very frequently create PivotTables that are formatted to look like tables, and having subtotals turned off and grand totals turned on is a key part of this. The more fields that are put on rows the more subtotals there are to calculate and the more the overhead increases; it’s not uncommon to find situations where the number of subtotal values is much greater than the number of values actually displayed in the PivotTable.
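
To get a feel for how quickly the overhead grows, here is a rough Python model that counts the values needed for a single measure when fields are nested on one axis: one detail value per distinct combination, one subtotal per distinct prefix of that combination, and one grand total. This is a simplification of what Excel and the engine really do. The member counts in the first example (two members and five members) are a guess on my part, chosen only because they are consistent with the 11 and 13 values mentioned above, and the field values in the second example are invented.

```python
from typing import Sequence, Tuple


def pivot_value_counts(rows: Sequence[Tuple[str, ...]]) -> Tuple[int, int, int]:
    """Return (detail, subtotal, grand total) value counts for one measure.

    Each row is one non-empty combination of the fields, outermost field first.
    """
    distinct_rows = set(rows)
    if not distinct_rows:
        raise ValueError("at least one row is needed")
    field_count = len(next(iter(distinct_rows)))
    detail = len(distinct_rows)
    # One subtotal for every distinct prefix shorter than the full combination
    subtotals = sum(
        len({row[:depth] for row in distinct_rows})
        for depth in range(1, field_count)
    )
    return detail, subtotals, 1


# Example 1: two fields with 2 and 5 members (assumed), every combination present
two_fields = [(new, f"Type {i}") for new in ("Y", "N") for i in range(1, 6)]
detail, subtotals, grand_total = pivot_value_counts(two_fields)
print(detail + grand_total, detail + subtotals + grand_total)
# prints: 11 13

# Example 2: four fields on rows and only three sparse combinations (invented)
four_fields = [
    ("UK", "London", "Flat", "Y"),
    ("FR", "Paris", "House", "N"),
    ("DE", "Berlin", "Flat", "N"),
]
print(pivot_value_counts(four_fields))
# prints: (3, 9, 1)
```

In the second example only four values would be displayed (three detail values and the grand total) but nine subtotals would also be requested if they were not excluded from the query.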

This doesn’t solve all the performance problems associated with PivotTables and Power BI though and more work is planned for the future.

[Thanks to Yaakov Ben Noon for driving this work!]
