Finding Power BI Queries And Refreshes That Have Been Throttled With Workspace Monitoring

A lot of new documentation was published recently around Fabric capacities, including documentation on how to size and govern capacities and how to troubleshoot problems with capacities. The latter has instructions on how to use the Capacity Metrics App to determine if your Power BI queries are being throttled; it mentions that you can also determine if queries have been throttled using Workspace Monitoring or Log Analytics but doesn’t go into details about how, something I will address in this post.

The key to using Log Analytics and Workspace Monitoring to troubleshoot throttling is the Execution Metrics event. Queries and semantic model refreshes always have an Execution Metrics event associated with them, and you can connect all the events for a specific operation using the OperationId column in Workspace Monitoring (as mentioned in my last post) or the XmlaRequestId column in Log Analytics – these columns contain the same information; only the column name was changed in Workspace Monitoring. When a query or a refresh is delayed because of throttling, the Execution Metrics event will include a metric called capacityThrottlingMs, which gives the total amount of time in milliseconds that the query or refresh was delayed before it was allowed to execute.
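
For example, if you already have the OperationId of a single query or refresh, you can check whether it was throttled with a quick lookup like this. This is a minimal sketch against the Workspace Monitoring SemanticModelLogs table; the OperationId value shown is a hypothetical placeholder you would replace with your own:

```kusto
//find the ExecutionMetrics event for one operation and extract capacityThrottlingMs
SemanticModelLogs
| where OperationName == "ExecutionMetrics"
//hypothetical OperationId - replace with a real value from your logs
| where OperationId == "00000000-0000-0000-0000-000000000000"
| extend EM = parse_json(EventText)
//capacityThrottlingMs is only present when the operation was delayed
| project Timestamp, OperationId, capacityThrottlingMs = toint(EM.capacityThrottlingMs)
```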

The documentation for Execution Metrics contains a sample KQL query for Log Analytics that uses Execution Metrics to return aggregated data on throttling but doesn’t explain how it works. Here’s a version of that query rewritten for Workspace Monitoring that returns the individual query and refresh operations that were throttled in the last five days:

let
EventsLast5Days =
SemanticModelLogs
| where Timestamp > ago(5d)
| where OperationName in ("CommandEnd", "QueryEnd", "DiscoverEnd", "ExecutionMetrics");
let 
ExecutionMetricsEvents =
EventsLast5Days
| where OperationName == "ExecutionMetrics"
| project ExecutionMetrics = EventText, OperationId
| extend EM = parse_json(ExecutionMetrics)
| extend capacityThrottlingMs = toint(EM.capacityThrottlingMs)
| project-away EM, ExecutionMetrics
| where capacityThrottlingMs > 0;
let
QueryDiscoverRefreshEvents = 
EventsLast5Days
| where OperationName in ("CommandEnd", "QueryEnd", "DiscoverEnd")
| project Timestamp, OperationName, OperationDetailName, QueryText = EventText, DurationMs, CpuTimeMs, OperationId, ExecutingUser;
ExecutionMetricsEvents
| join kind=inner QueryDiscoverRefreshEvents on OperationId
| project-away OperationId1

Here’s how this query works:

  • The EventsLast5Days variable gets all the events in the last 5 days which have an OperationName of either CommandEnd (which is fired when a refresh finishes), QueryEnd (which is fired when a DAX or MDX query finishes), DiscoverEnd (which is fired when a Discover command, which is used to retrieve metadata from the Power BI engine, finishes) or ExecutionMetrics.
  • The ExecutionMetricsEvents variable takes the EventsLast5Days variable and filters it down to just the ExecutionMetrics event. It then parses the JSON value containing the metrics in the EventText column, extracts the capacityThrottlingMs metric if it is present, and filters the events down to just those where capacityThrottlingMs is greater than 0.
  • The QueryDiscoverRefreshEvents variable takes the EventsLast5Days variable and filters it down to just the CommandEnd, DiscoverEnd and QueryEnd events.
  • Finally there is an inner join between the ExecutionMetricsEvents variable and the QueryDiscoverRefreshEvents variable on the OperationId column, which results in a table of the CommandEnd, DiscoverEnd and QueryEnd events where throttling was present.

A couple of important things to point out:

  • The duration of an event in Workspace Monitoring, given in the DurationMs column, does not include any delays added as a result of capacity throttling. As a result you can have queries that take 0ms but which were delayed by 20 seconds – something you might not realise if you only look at the DurationMs column.
  • Having the DAX queries that are run when you display a Power BI report delayed is bad enough, but throttling delays are also applied to Discover commands, and Power BI reports (and indeed every client tool) will run a number of these before they run the queries they need to render a report. This means that the impact of throttling on report performance can be a lot greater than you might expect.
  • The new Timepoint Item Detail page in the Capacity Metrics App, in preview at the time of writing, includes a slicer on the OperationId column. This means that you can do the opposite of what I showed in my previous blog post and filter data in the Capacity Metrics App using an OperationId found from a query in Workspace Monitoring. This would allow you to find the percentage of the available CUs on a capacity that a given query or refresh used.
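
To illustrate the first point above: since DurationMs does not include the throttling delay, you can compute a rough total wall-clock figure yourself by adding capacityThrottlingMs to DurationMs. Here's a sketch that follows the same pattern as the query earlier in this post:

```kusto
//find throttled operations and their throttling delay
let ThrottledOps =
SemanticModelLogs
| where Timestamp > ago(5d)
| where OperationName == "ExecutionMetrics"
| extend capacityThrottlingMs = toint(parse_json(EventText).capacityThrottlingMs)
| where capacityThrottlingMs > 0
| project OperationId, capacityThrottlingMs;
//join back to the end events and add the delay to the reported duration
SemanticModelLogs
| where Timestamp > ago(5d)
| where OperationName in ("CommandEnd", "QueryEnd", "DiscoverEnd")
| join kind=inner ThrottledOps on OperationId
| extend TotalWaitAndRunMs = DurationMs + capacityThrottlingMs
| project Timestamp, OperationName, DurationMs, capacityThrottlingMs, TotalWaitAndRunMs
| order by TotalWaitAndRunMs desc
```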

How To Get The Details Of Power BI Operations Seen In The Capacity Metrics App

It’s the week of Fabcon Europe and you’re about to be overwhelmed with new Fabric feature announcements. However there is a new blink-and-you’ll-miss-it feature that appeared in the latest version of the Fabric Capacity Metrics App (released on 11th September 2025, version 47) that won’t get any fanfare but which I think is incredibly useful – it allows you to link the Power BI operations (such as queries or refreshes) you see in the Capacity Metrics App back to Workspace Monitoring, Log Analytics or Profiler so you can get details such as the query text.

Let’s say you’re in the Capacity Metrics App in the existing Timepoint Detail page. In the top right-hand corner of both the “Interactive operations for time range” and “Background operations for time range” tables there is a dropdown box that allows you to display additional columns. This box now contains an option to display the OperationId column:

After you’ve added this column you’ll see it contains a GUID:

There is also, incidentally, a new page in preview called Timepoint Item Detail (preview) which is reached through the new Timepoint Summary (preview) page and which will eventually replace the Timepoint Detail page. If you haven’t seen this you should check it out: I think it’s a big improvement. This also has a dropdown box that allows you to show the OperationId column in its versions of the “Interactive operations for time range” and “Background operations for time range” tables.

This page also has a dropdown box at the top that allows you to filter operations by OperationId.

The OperationId is a unique identifier for each Power BI operation. Right-clicking on this value and selecting Copy/Copy value to copy it:

…means that you can use this value to cross-reference with the log data you find in Workspace Monitoring, Log Analytics or Profiler. For example I have a workspace with Workspace Monitoring enabled and found the following OperationId in the Capacity Metrics App: a7a2d4d4-2a9b-4535-b65a-a0cc0389d821. The following KQL query run on Workspace Monitoring:

let
OperationIdFromCapacityMetrics = "a7a2d4d4-2a9b-4535-b65a-a0cc0389d821";
SemanticModelLogs
| where OperationId == OperationIdFromCapacityMetrics
| where OperationName == "QueryEnd"
| project Timestamp, ItemName, EventText, DurationMs, OperationDetailName
| order by Timestamp

…returns the query text for the DAX query associated with this interactive operation:

There’s a lot of other information you can get by writing KQL queries from Workspace Monitoring (for some examples see here) such as the IDs of the visual and the report that generated the query. If you’re using Log Analytics or Profiler there is no OperationId column – it’s called XmlaRequestId in Log Analytics and RequestId in Profiler – but the same information is available there too.
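
For example, assuming the ApplicationContext column in SemanticModelLogs contains the report and visual IDs in its Sources array (as the equivalent column does in Log Analytics), a sketch like this returns them alongside the query text for QueryEnd events:

```kusto
//extract the report and visual that generated each DAX query
//assumes ApplicationContext holds JSON with a Sources array, as in Log Analytics
SemanticModelLogs
| where OperationName == "QueryEnd"
| extend AC = parse_json(ApplicationContext)
| extend ReportId = tostring(AC.Sources[0].ReportId),
         VisualId = tostring(AC.Sources[0].VisualId)
| project Timestamp, ItemName, ReportId, VisualId, DurationMs, EventText
| order by Timestamp
```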

This is very useful for admins trying to identify why a capacity is overloaded: it means that you can now see the details of expensive queries or refresh operations and understand why they are causing problems. Make sure you upgrade your Capacity Metrics App to the latest version and enable Workspace Monitoring on all important workspaces so you can do this!

[Thanks to Tim Bindas and Lukasz Pawlowski for letting me know about this]

Limit The Impact Of Expensive Power BI Queries On Your Capacity By Reducing The Query Timeout

The recent announcement of Surge Protection gives Fabric/Power BI capacity admins a way to restrict the impact of background operations on a capacity, preventing them from causing throttling. However, at the time of writing, Surge Protection does not prevent users that are running expensive DAX or MDX queries – which are interactive operations – from causing problems on your capacity. Indeed, right now, there is no direct way to stop runaway queries from consuming a lot of CUs, although there is something you can do which will help a lot: reducing the query timeout.

Surge Protection doesn’t address the problem of expensive queries yet because Power BI only knows the CU usage of a DAX or MDX query when it has finished running – by which time it’s too late to do anything about it. In many cases, though, DAX or MDX queries that consume a lot of CUs are also slow. Therefore reducing the query timeout, which will kill any query that runs longer than a specified duration, will stop these queries from consuming so many CUs.

There are two default query timeouts that you should be aware of in Power BI. First, all DAX queries generated by a Power BI report have a 225 second timeout applied by the report itself. This timeout can be changed in Power BI Desktop but it cannot be changed on a published report in the Power BI Service. Second, you can set a timeout at the capacity level by changing the Query Timeout property in the admin portal. The default setting here is 3600 seconds (one hour). Unlike the first timeout, which only applies to the DAX queries generated by a Power BI report, this timeout applies to all queries run on any semantic model associated with the capacity, including the MDX queries generated by Excel PivotTables via Analyze In Excel. Setting this second timeout to less than 225 seconds means that it will take precedence over the first timeout. Therefore it’s the Query Timeout property on your capacity that you should set.

Hitting a timeout in a Power BI report will give the user a “Query has exceeded the available resources” error; clicking See Details/More Details will give you a message like this:

The XML for Analysis request timed out before it was completed. Timeout value: 10 sec.

Hitting the query timeout in an Excel PivotTable will give you the same message:

What value should you set the Query Timeout to? In my opinion no query should ever run for more than 30 seconds because anything slower will result in a poor experience for your end users – no-one wants to sit around for ages waiting for a report to render. I also think it should be possible to tune any semantic model so all queries run under 30 seconds if you know what you’re doing. That said, in the real world, setting a timeout of 30 seconds may be unrealistic: developers may not have the skills to tune their semantic models. As a result I find a timeout of 100 seconds is often a good compromise, but you should experiment with different timeouts to find the minimum value you can get away with.

It’s important to note that reducing the query timeout will not stop every expensive query. This is because it’s perfectly possible to have very fast queries that consume a lot of CUs – for example when distinct count measures are used, and/or when there are very large data volumes and/or when there are complex but highly-optimised measures. There are also relatively rare cases where a query will carry on running beyond the duration specified by the timeout, because the Vertipaq engine only checks if the timeout has been exceeded at certain points in the code and, depending on the query, there could be several seconds (sometimes more) between these checks. Equally, some very slow queries may not use a lot of CUs and having them time out might cause unnecessary disruption. Overall, though, in my experience setting a timeout will stop enough expensive queries to make doing so worthwhile.

[Update: my colleague Akshai Mirchandani has just reminded me that you can also set the Query Timeout at the workspace level as a Server Property using SQL Server Management Studio, as detailed here. The property is called ServerTimeout. This gives you more flexibility than setting it for the whole capacity.]