Since I wrote about the Snowflake query tags generated by Power BI earlier this year, one important piece of functionality has been added: an OperationID that allows you to link a SQL query in Snowflake to events in Workspace Monitoring and the Capacity Metrics app. Let’s see some examples.
I created a DirectQuery semantic model connected to Snowflake in a workspace with Workspace Monitoring enabled and ran a report. Looking at the Snowflake monitoring page, I clicked on a SQL query that I knew had been generated by Power BI and found the query tag:
The ActivityId in the query tag is the OperationId of the query, and as you’ll know if you have read my recent posts, you can use the OperationId to get more details on the DAX query that generated this SQL query in Workspace Monitoring. Here’s the KQL I used to query Workspace Monitoring:
SemanticModelLogs
| where Timestamp > ago(1h)
| where OperationId == "377da3e0-900c-474f-aca2-c6bb6cd4d5a6"
| project Timestamp, OperationName, EventText, DurationMs
And here are the results, showing the events associated with this DAX query:
Once you’ve found the events associated with a DAX query in Workspace Monitoring you can then get all kinds of other useful information, such as the IDs of the report and visual that generated the DAX query, which are found in the ApplicationContext column of the SemanticModelLogs table.
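As a rough illustration of what you can do with that column, here’s a Python sketch that parses a made-up ApplicationContext value to pull out the report and visual IDs. The JSON structure and field names here are illustrative only – check the actual contents of the column in your own logs:

```python
import json

# A made-up ApplicationContext value; the exact field names may differ,
# so inspect the real JSON in your own SemanticModelLogs data
application_context = """
{
  "DatasetId": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeffff",
  "Sources": [
    { "ReportId": "11111111-2222-3333-4444-555555555555",
      "VisualId": "abcdef0123456789" }
  ]
}
"""

ctx = json.loads(application_context)
for source in ctx.get("Sources", []):
    print("ReportId:", source["ReportId"], "VisualId:", source["VisualId"])
```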
Here’s more information about this DAX query from the new Timepoint Detail (Preview) page in the Capacity Metrics App, with the OperationId column showing the same ID from the query tag (see here for more details on this feature of the Capacity Metrics App):
This trick also works for Import mode refreshes. Here’s the query tag from Snowflake for a SQL query generated by Power BI during the refresh of an Import mode semantic model:
Again, you can use this ID to query Workspace Monitoring to get all the activity associated with the refresh operation:
SemanticModelLogs
| where Timestamp > ago(1h)
| where OperationId == "8f552c8e-4f7c-4376-b663-198f7f310d09"
| project Timestamp, OperationName, EventText, DurationMs
And again you can find the OperationId for the refresh in the Timepoint Detail (Preview) page of the Capacity Metrics App (you can search for an ID in the Operation ID slicer on this page too):
This is going to make it a lot easier to troubleshoot slow DirectQuery reports or Import mode refreshes when you’re using Snowflake as a source.
[Thanks to my colleague Thierry Houy for this information]
Since the November 2024 Power BI release blog post announced that queries sent to Snowflake by Power BI include a query tag, I’ve had a lot of questions from people who couldn’t see this happening or who wanted to know what the query tags contained, so in this post I thought I would outline the current status.
The query tagging feature for the Power BI Snowflake connector actually didn’t get released in November 2024 and even now, in April 2025, it’s only available for DirectQuery connections and Import mode refreshes that use the V1.0 connector (the V2.0 connector will support query tags soon). Here’s an example of what a query tag looks like for a SQL query generated by Power BI from a DirectQuery semantic model:
At the time of writing only SQL queries sent from the Power BI Service contain query tags, not those sent from Power BI Desktop. Also there is no way to customise the contents and unlike SQL queries sent to SQL Server-related sources there is no information on the report or visual that generated the SQL query. In the future some of these limitations may go away.
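If you want to extract the OperationId from a query tag programmatically – for example after retrieving the tag from Snowflake – a minimal Python sketch might look like this. The tag shown is a simplified, made-up example; the real tag contains other properties besides ActivityId:

```python
import json

# A simplified, made-up query tag; the real tag generated by Power BI
# contains additional properties besides ActivityId
query_tag = '{"ActivityId": "377da3e0-900c-474f-aca2-c6bb6cd4d5a6"}'

tag = json.loads(query_tag)
operation_id = tag["ActivityId"]  # use this to filter Workspace Monitoring
print(operation_id)
```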
I’m a big fan of Phil Seamark’s “visualise your refresh” pbix file, which allows you to analyse the different operations in an Import mode semantic model refresh (similar functionality has now been built into Semantic Link Labs too). In particular I like how it allows you to visualise the operations inside a refresh as a Gantt chart, which makes it easy to see how much parallelism you’re getting when you refresh – and as I discussed here, the amount of parallelism can have a massive effect on how long a refresh takes. However, Gantt charts are just one way to visualise this data and after one of my colleagues showed me the KQL range() function I realised there was a different way to approach this problem using KQL and Fabric Workspace Monitoring.
Here’s the KQL query I came up with to find the number of partitions that are refreshing in parallel at any given point in time when a semantic model is refreshed:
SemanticModelLogs
//Filter by Timestamp for performance
| where Timestamp > ago(1h)
//Filter by the OperationId of the refresh
| where OperationId == "insert OperationId of refresh here"
//Only get the events fired when partition refresh finishes
| where OperationDetailName == "Process" and OperationName == "ProgressReportEnd"
//Find the start time of the partition refresh
| extend StartTime = datetime_add("Millisecond", -1 * DurationMs, Timestamp)
//Create a list of all the seconds where the refresh was in progress
| extend StartSecond = range(bin(StartTime,1s), bin(Timestamp,1s), 1s)
| mv-expand StartSecond
//Aggregate the data and get the number of partitions being refreshed in each one-second bin
| summarize PartitionsBeingRefreshed=count() by todatetime(StartSecond)
//Sort in ascending order by one-second bins
| order by StartSecond asc
What this query does is:
Filters the events in the SemanticModelLogs table to just those for the refresh whose OperationId you enter
Filters again so you only get the ProgressReportEnd events that are fired when a partition refresh finishes
Calculates the start time of each partition refresh by taking the finish time and subtracting the duration
Creates a table with one row for each second that each partition refresh was in progress
Aggregates this table to get the number of partition refresh operations that were in progress in each one-second bin
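To make the range()/mv-expand/summarize logic concrete, here’s the same approach sketched in Python with made-up partition timings:

```python
from datetime import datetime, timedelta
from collections import Counter

# Made-up (finish time, duration in ms) pairs for three partition refreshes,
# mimicking the ProgressReportEnd events the KQL query uses
base = datetime(2025, 1, 1, 10, 0, 0)
partitions = [
    (base + timedelta(seconds=10), 10_000),  # ran 10:00:00 - 10:00:10
    (base + timedelta(seconds=12), 8_000),   # ran 10:00:04 - 10:00:12
    (base + timedelta(seconds=20), 5_000),   # ran 10:00:15 - 10:00:20
]

bins = Counter()
for finished_at, duration_ms in partitions:
    started_at = finished_at - timedelta(milliseconds=duration_ms)
    # one entry per one-second bin the refresh was in progress,
    # like the range()/mv-expand steps in the KQL query
    second = started_at.replace(microsecond=0)
    while second <= finished_at:
        bins[second] += 1
        second += timedelta(seconds=1)

# At 10:00:05 the first two partitions overlap, so two refreshes in progress
print(bins[base + timedelta(seconds=5)])
```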
All events associated with a single semantic model refresh can be identified by the value in the OperationId column in the SemanticModelLogs table in Workspace Monitoring. To find it, just run a KQL query on the SemanticModelLogs table to get all the events from the time period when your refresh took place; it will be easy to spot the events associated with the refresh by looking at the contents of the EventText column, so all you need to do is copy the value from the OperationId column from one of these events.
I refreshed a semantic model with six dimension tables and one fact table containing ten partitions and here’s what the query above showed when visualised as a line chart in a KQL Queryset:
As you can see, at first four partitions are refreshing in parallel; this quickly drops and then rises, and for the first half of the refresh six partitions are refreshing in parallel. At about the two-thirds point this drops to four partitions refreshing in parallel and as they complete the line chart drops to one. Since six partitions refreshing in parallel is the maximum I allowed for this refresh you can see it’s reasonably efficient.
To get more detail on what is being refreshed, here’s a variation of the KQL query above that doesn’t perform the final aggregation:
This data can then be used in a Power BI report and visualised as a stacked bar chart:
The advantage of this is that you can see which partitions are refreshing when: in this case you can see that it was the partitions from the dimension tables that refreshed first, followed by the partitions from the fact table.
If you’re not using Fabric Workspace Monitoring it should be easy to adapt the queries above to work with Power BI Log Analytics integration instead – it’s just a matter of changing a few column names.
There’s one problem I can think of with this approach: if there are multiple small partitions being refreshed that take under a second there’s a chance that you’ll see inflated parallelism numbers in the results at some points. I don’t think that’s a major problem though and overall I think this is quite a useful way to understand how much parallelism you’re getting during a refresh.
[Thanks to Matthew Farrow for the inspiration here – check out his excellent, detailed series of articles on LinkedIn starting here about understanding Fabric capacities and billing]
In my recent posts on the Command Memory Limit error and the partialBatch mode for Power BI semantic model refresh, I mentioned that one way to avoid memory errors when refreshing large semantic models was to run a refresh of type clearValues followed by a full refresh – but that the downside of doing this was that your model would not be queryable until the full refresh had completed. Immediately afterwards some of my colleagues (thank you Alex and Akshai) pointed out that there was in fact a way to ensure a model remained queryable while using this technique: using Semantic Model Scale Out. How? Let me explain…
Problem Recap
As I described here, when you run a full refresh on a semantic model that already has data loaded into it, the original version of the model stays in memory while the new data is loaded in, which means that the overall amount of memory required for the refresh is around double the amount needed to store the model. Since there are limits on the amount of memory that a model can use which vary depending on whether you’re using a capacity or not and what size capacity you’re using, if you have a large model then you might find refresh fails with a memory error. Running a refresh of type clearValues first removes all the data from the model, reducing its memory usage, and improves the chances of a full refresh using less than the allowed maximum amount of memory and therefore succeeding. Removing all the data from your model, though, means that your reports won’t show any data until you do a full refresh.
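As a rough sketch of the arithmetic involved, assuming the worst case where a full refresh needs about double the model’s size in memory (real refreshes also use memory for Power Query evaluation and other work, so this is a simplification):

```python
# Rough arithmetic for why clearing the data first can help, assuming the
# worst case where a full refresh needs roughly double the model's size
def peak_refresh_memory_gb(model_size_gb: float, cleared_first: bool) -> float:
    # the copy being refreshed always needs ~model_size_gb;
    # the original copy stays in memory unless it was cleared first
    existing = 0.0 if cleared_first else model_size_gb
    return existing + model_size_gb

# A 3.3GB model against a 5GB per-model limit
limit_gb = 5.0
print(peak_refresh_memory_gb(3.3, cleared_first=False) > limit_gb)  # → True, refresh fails
print(peak_refresh_memory_gb(3.3, cleared_first=True) > limit_gb)   # → False, refresh fits
```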
How can Semantic Model Scale Out help?
Semantic Model Scale Out (also known as Query Scale Out) is primarily a performance feature: by creating multiple replicas of your semantic model, one which is used for refreshes and one or more that are used to answer queries, it prevents refreshes from interfering with report performance and allows report queries to be distributed over multiple physical nodes. All of these replicas of the semantic model need to be kept in sync, which means that when the model that is used for refresh has new data loaded into it, all the other versions of the model need to be replaced by this new version. By default this synchronisation happens automatically but you can also turn this off and control the synchronisation manually.
As a result, if you turn on Semantic Model Scale Out and turn off automatic synchronisation, you can run a refresh of type clearValues to clear the data from your model and then run a full refresh without affecting your end users – who won’t see the new data until you do a manual synchronisation.
Example
Let’s see a simple illustration of how this works. Using the same semantic model I used in this post, hosted on an F64 capacity, I created a report with a single card visual:
With Scale Out turned off, I ran a refresh of type clearValues. This was very fast and after it had finished – as you would expect – the card visual showed a blank value because all of the data had been cleared from the model:
I then ran a full refresh and when it completed the card showed data again.
Next, I turned on Scale Out in the settings pane of the semantic model:
…and turned off automatic replica synchronisation using the script here.
My colleague Michael Kovalsky has very kindly added functionality to Semantic Link Labs to handle manual replica synchronisation, which means it was extremely simple to run the refreshes and synchronise replicas from a Fabric notebook. After installing Semantic Link Labs:
%pip install semantic-link-labs
…I used the following Python code in a notebook cell to run a refresh of type clearValues, followed by a full refresh, followed by a manual replica synchronisation:
import sempy_labs as labs
WorkspaceName = "Model Memory Tests"
SemanticModelName = "ModelMemoryDemo3"
# run a refresh of type clearValues first
labs.refresh_semantic_model(dataset=SemanticModelName, workspace=WorkspaceName, refresh_type="clearValues")
# then a refresh of type full
labs.refresh_semantic_model(dataset=SemanticModelName, workspace=WorkspaceName, refresh_type="full")
# then a manual replica sync
labs.qso_sync(dataset=SemanticModelName, workspace=WorkspaceName)
And as expected the card visual shown above showed data all through the refreshes.
Continuing my series on Power BI model memory errors (see part 1 and part 2), in this post I will look at the Command Memory Limit which restricts the amount of memory that XMLA commands like Create, Alter and most importantly Refresh can use.
If you’ve ever been told that your semantic model should consume less than half the amount of memory available to it because memory consumption can double during a full refresh, then that is because of the Command Memory Limit. Every time a model is refreshed in Power BI, that refresh is initiated by running a Refresh command. During the refresh a copy of the model is created in the background and it is the copy that is refreshed; when the refresh is completed, Power BI deletes the original version of the model and replaces it with the copy. While the refresh is in progress, the memory consumed by this copy of the model and all the operations needed to load data into it (including any Power Query queries used to get data from your data sources and to transform that data) is associated with the Refresh command. The Command Memory Limit specifies how much memory the Refresh command is allowed to use.
The good news is that there is an excellent, detailed explanation of the Command Memory Limit in the docs here which I recommend you read before continuing:
What it says is that the amount of memory that XMLA commands like Refresh can use is the maximum allowed size for a semantic model for the capacity you’re using (as documented in the table here in the Max Memory column) minus the amount of memory the semantic model is using when the command starts.
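In other words, the limit can be sketched as a simple subtraction:

```python
# The Command Memory Limit as described in the docs: the capacity's per-model
# Max Memory minus the memory the model is using when the command starts
def command_memory_limit_gb(max_memory_gb: float, model_memory_gb: float) -> float:
    return max_memory_gb - model_memory_gb

# For example, a capacity with a 5GB Max Memory limit and a model already
# using 3.3GB leaves roughly 1.7GB for the Refresh command
print(round(command_memory_limit_gb(5.0, 3.3), 1))  # → 1.7
```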
Let’s look at an example of an error caused by exceeding the Command Memory Limit.
I created an Import mode semantic model whose total size was 3.3GB, and which consisted of a single partitioned table with 20 columns, each of which contained random decimal numbers. I refreshed this model on an F64 capacity where the maximum allowed memory per model is 25GB and the refresh succeeded.
How much memory did the Refresh command use? There are two ways to find out. For a while now you have been able to get the approximate peak memory usage during a refresh (along with the approximate peak memory usage just for Power Query queries, which is a subset of this figure) from the Command End event associated with that refresh in Profiler and Log Analytics; I blogged about this here. However the new Execution Metrics event in Profiler and Log Analytics makes the same information even easier to extract. Here’s what the Execution Metrics event for the refresh looked like in this case:
[Note: if you’re looking in Profiler or Log Analytics you’ll see a lot of Execution Metrics events. Typically the Execution Metrics events for a refresh will be generated immediately after the Command End event for that refresh, but to be sure you should look for matching values in the RequestId column (in a Profiler trace) or the XmlaRequestId column (in Log Analytics) to associate an Execution Metrics event with a Command End event.]
The metric to look at above is approximatePeakMemConsumptionKB and it shows the refresh used about 6,074,967KB or 5.8GB at its peak – a lot more than the 3.3GB size of the model before or after the refresh. I designed the model specifically for this to happen (20 columns of random decimal numbers are not easy to compress) and in most cases the memory usage will be lower relative to the size of the model.
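If you’re extracting this metric programmatically, the conversion from KB is straightforward. Here’s a Python sketch using a simplified, made-up version of the Execution Metrics payload (the real event contains many more metrics than this):

```python
import json

# Simplified, made-up Execution Metrics payload; the real event text
# contains many more metrics alongside approximatePeakMemConsumptionKB
execution_metrics = '{"approximatePeakMemConsumptionKB": 6074967}'

metrics = json.loads(execution_metrics)
peak_gb = metrics["approximatePeakMemConsumptionKB"] / (1024 * 1024)
print(f"{peak_gb:.1f}GB")  # → 5.8GB
```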
I then scaled the capacity down to an F16 which only has a memory limit of 5GB for semantic models and refreshed again. As you would expect, the refresh failed with the following error:
Resource Governing: This operation was canceled because there wasn’t enough memory to finish running it. Either reduce the memory footprint of your dataset by doing things such as limiting the amount of imported data, or if using Power BI Premium, increase the memory of the Premium capacity where this dataset is hosted. More details: consumed memory 1780 MB, memory limit 1779 MB, database size before command execution 3340 MB. See https://go.microsoft.com/fwlink/?linkid=2159753 to learn more. The current operation was cancelled because another operation in the transaction failed.
This is the error message associated with hitting the Command Memory Limit (the associated error number is -1052901373). What this message is saying is that the model was consuming 3340MB (3.3GB) before the refresh started, then the refresh itself reached a maximum of 1780MB (1.7GB), but at that point it was cancelled because 5GB – 3.3GB = 1.7GB. Here’s the Execution Metrics data for the refresh Command:
So is it even going to be possible to refresh this model on an F16 capacity given that refreshing it requires 5.9GB? Well, the most memory-efficient way of refreshing a semantic model without changing it is to clear all the data out of it first, then refresh the data in each partition in each table one at a time, then do a recalc, all in separate transactions – which is exactly what the partialBatch commit mode of the Enhanced Refresh API (which I blogged about recently here) does if you set its max_parallelism property to 1. So I tried this – and it failed again ☹️. Unsurprisingly the first refresh operation of type ClearValues ran successfully and peaked at 0.01GB of memory, after which the size of the model in memory would have been negligible. Then the refresh of type DataOnly for the first partition in the table was successful, but the refresh operation peaked at 3.4GB. The second partition refresh of type DataOnly then failed; I assume it would also have peaked at 3.4GB, but the error message told me that the model was already 2.4GB in size when the refresh started, so the memory limit for this refresh was 2.6GB.
If you’re hitting the Command Memory Limit and you don’t want to increase the size of your capacity but you are willing to make changes to your model, then there could be several ways to reduce the amount of memory used during a refresh – but in order to understand which method will work you will need to know what is using memory during the refresh, and that isn’t easy. Using the partialBatch commit option with the Enhanced Refresh API does at least split the refresh out into its constituent parts and makes it easier to see at which stage the refresh fails. In this case refresh failed at the DataOnly stage so creating more, smaller partitions could help. Following the steps for reducing model size given in blog posts like this one by Nikola Ilic will also help a lot to reduce memory consumption during a refresh. Other common culprits for high memory consumption during a refresh include calculated columns and tables, or Power Query queries that buffer large amounts of data in memory (for example because of transformations that sort or aggregate data and which do not fold). Therefore following Roche’s maxim and doing all your transformations and creating the equivalent of your calculated columns and tables in your data source, before the data is loaded into Power BI, will help. If you do have to use calculated columns and tables then you should look at tuning your DAX expressions so they use less memory. I’ll look at some examples of how to tune your Power Query queries or DAX expressions to reduce memory usage in future posts.
In the Power BI Service, Import mode models are stored offline and paged into memory only when they are needed – for example because someone runs a report that uses the model as its source. As discussed in my last post, though, there’s a limit on the amount of memory that a model can use which varies by the size and type of capacity you’re using. There’s also an extra complication in that Import mode models are stored offline in a compressed format that means the Power BI Service doesn’t know exactly how much memory will be needed if the entire model needs to be held in memory. As a result there is an additional limit enforced on the size of the offline copy of Import mode models to ensure they don’t use too much memory when they are eventually paged in.
This limit can be configured in the Admin Portal by setting the Max Offline Semantic Model Size property (still shown as the Max Offline Dataset Size in the UI at the time of writing):
The default value of this property is 0, which means that the limit will be set to the maximum allowed value for the capacity SKU you’re using. Different maximum limits are enforced if you have turned on the Large semantic model storage format option for your model and if you haven’t. The maximum limits for the Large model format are the same as the limits on the maximum amount of memory that a model can use, as listed in the table here; for the Small model format they are published here. You can also set the property to a value that is lower than the allowed limit if you want to control the size of the models that your developers can publish – smaller models generally use fewer CUs during refresh or querying, which could help reduce the load on your capacity.
To test what happens when you exceed this limit I created a new F16 capacity (which has a maximum offline model size of 5GB), and then created a workspace on it containing an Import mode Power BI semantic model using the Large model format that was 3.3GB in size. I then scaled my capacity down to an F2 and then tried running a DAX query against the model. The Max Offline Semantic Model Size limit for an F2 capacity for models with the Large model format enabled is 3GB, which is less than the size of the model I had created. As a result the query returned the following error message:
Database ‘xyz’ exceeds the maximum size limit on disk; the size of the database to be loaded or committed is 3500853108 bytes, and the valid size limit is 3221225472 bytes. If using Power BI Premium, the maximum DB size is defined by the customer SKU size (hard limit) and the max dataset size from the Capacity Settings page in the Power BI Portal.
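You can sanity-check the byte counts in the error message above with a little arithmetic – the “valid size limit” is exactly 3GB expressed in bytes:

```python
# The byte counts from the error message: the offline model size
# versus the 3GB Max Offline Semantic Model Size limit
database_size_bytes = 3500853108
limit_bytes = 3221225472

print(limit_bytes == 3 * 1024**3)               # → True, the limit is exactly 3GB
print(round(database_size_bytes / 1024**3, 2))  # → 3.26, the model is ~3.26GB on disk
```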
If you’re getting this error then you need to reduce the size of your model, for example by reducing the number of rows in your tables, removing unused columns, or reducing the number of distinct values in the columns you keep. You can find more suggestions on how to reduce model size here.
[Thanks to Akshai Mirchandani for answering my questions on this topic]
You probably know that there is a limit on the amount of memory that a semantic model in Power BI can use. This is true of all types of semantic model – Import, Direct Lake and DirectQuery – but it’s not something you usually need to worry about for DirectQuery mode. The amount of memory a model can use depends on whether you’re using Shared (aka Pro) or a Premium/Fabric capacity, and if you’re using a capacity, how large that capacity is. In Shared/Pro the maximum amount of memory that a semantic model can use is 1GB; if you are using a capacity then the amount of memory available for models in each SKU is documented in the table here in the Max Memory column:
What counts as “memory usage” though? More importantly, how can you breach this limit and what do all of the different memory-related error messages that you might see mean? In this series I will try to answer these questions, and in this post I will look at one particular error you see when your model needs to use more memory than it is allowed to.
First of all it’s important to understand that the amount of memory used by a semantic model is not the same as the amount of data “in” the model. The diagram below shows how model memory usage can be broken down. The data in the columns and tables of your model, along with supporting objects like relationships (represented by the blue box in the diagram below) makes up just one part of the overall model memory usage. In addition, more memory is needed to store data associated with row-level security, user sessions, caches and so on (represented by the orange box in the diagram below).
Both Import mode and Direct Lake models can page data in and out of memory as required, so the whole model may not be in memory at any given time. However, in order for a query to run, the data it needs must be in memory and cannot be paged out until the query has finished with it. Therefore out of all the memory consumed by a semantic model, at any given time, some of that memory is “evictable” because it isn’t in use while some of it is “non-evictable” because it is being used. Evictable memory may be paged out of memory for a variety of reasons, for example because the model is nearing its allowed memory limit.
Queries that are running on the model (the purple boxes in the diagram above) also consume memory. Each query has a limit on the amount of memory it can use – I mentioned the Query Memory Limit in this post but I will revisit it later on in this series – but the memory used by queries does not contribute directly to the overall memory use of a semantic model. However a query that is running will force parts of the model to be in memory for a certain amount of time, and this memory will be non-evictable while in use.
In summary then, the total amount of memory used by a semantic model is made up of two groups:
The data in the tables in your model (the blue box above)
Supporting data for RLS security roles, sessions and caches (the orange box above)
When the sum of these two groups exceeds the total amount of memory allowed for your model, and no data can be evicted from memory to reduce this sum, then you’ll get an error.
To illustrate this I created a new F2 capacity, which has a 3GB limit on the amount of memory used by a semantic model, loaded a table (called SourceData) with 3.5 million rows of random numbers stored as text into a Lakehouse, then created a new custom Direct Lake semantic model on it. I set the Direct Lake Behavior property on the model to “Direct Lake only” to prevent fallback to DirectQuery mode.
After creating the model I used DAX Studio’s Model Metrics feature with the “Read statistics from data” option turned off to find the amount of data stored in memory (ie the blue box value).
Unsurprisingly, at this stage, the size of the model was very small: only 8KB.
I then turned the “Read statistics from data” option on, knowing that this would force data to be paged into memory. This showed the total potential size of the model to be 4.25GB:
I was initially confused by this because this is already well over the 3GB limit, but it was pointed out to me that what is probably happening is that DAX Studio runs a number of DMV queries to get the data needed to calculate this value and when this happens different parts of the model are paged in and out of memory. It was certainly very slow for DAX Studio to calculate the Model Metrics when I did this which fits with the paging in/out theory.
Finally, I ran a simple DAX query to get the top 10 rows from the SourceData table:
EVALUATE TOPN(10, SourceData)
This query ran for about ten seconds and then failed with the following error message:
Resource Governing: We cannot complete the requested operation because there isn’t enough memory (consumed memory 4620 MB, memory limit 3072 MB). Either reduce the size of your dataset, such as by limiting the amount of in-memory data, or host the dataset on a Fabric or Premium capacity with a sufficient memory size. See https://go.microsoft.com/fwlink/?linkid=2159753 to learn more.
[The error code associated with this message is 0xC13E0006 or -1052901370]
This is the error that you get when your model needs to use more memory than it is allowed to use for the capacity SKU it is running on. The query references every column from the only table in the model, which means the whole table – which is the whole model – would have to be paged in to memory for the query to run, but the whole model requires more memory than is available on an F2 capacity.
If you aren’t getting this exact error message then something slightly different might be happening. In future posts in this series I will look at some of these other errors including the query memory limit and the command memory limit.
[Thanks to Marius Dumitru and Akshai Mirchandani for the information in this post]
Update: now that this series is concluded, here are the other posts in it:
I recently took part in a webinar with Denny Lee, Liping Huang and Marius Panga from Databricks on the subject of best practices for using Power BI on Databricks. You can view the recording on LinkedIn here:
My section at the beginning covering Power BI best practices for Import and DirectQuery doesn’t contain any new information – if you’ve been following the DirectQuery posts on this blog or read the DirectQuery guidance docs here and here then there won’t be any surprises. What I thought was really useful, though, was hearing the folks from Databricks talk about best practices on the Databricks side and this took up the majority of the webinar. Definitely worth checking out.
In my last post but one I showed how you could create a DGML file that contains detailed information on what happens during a Power BI Import mode refresh – the “job graph” – and visualise that in Visual Studio or VS Code, to help with performance tuning. In my last post, I explained the concepts of blocking and waiting which are key to understanding this data. In this post I’ll share and describe a Dataflow template that extracts the data contained in the DGML file to a table for other types of analysis – which turns out to be quite simple, because DGML is an XML-based format which is easy to work with in Power Query and Dataflows.
You can download the .pqt template file here (for more information on how to use templates see here). To use it, create a new Dataflow Gen2 and import the template file. You’ll see the following queries:
Next, change the values of the three parameters to contain the IDs of the refresh whose DGML file you want to load (the notebook in this post that creates the DGML file uses the refresh ID as the file’s name) and the workspace and lakehouse where that file is stored. Finally, set output destinations on either the RefreshJobs and/or RefreshJobsPivoted queries. The RefreshJobs query returns a table with one row per refresh job:
This gives you all the data in the DGML file that was visible using a DGML viewer but in table form. For each job you get the CreatedAt, RunnableAt, StartedAt and FinishedAt datetimes, the Blocked, Wait and Running durations, the Slot ID, the CPU used and whether the job is on the critical path.
The RefreshJobsPivoted query gives you exactly the same data but instead of giving you one row per job, you get three: one row for the blocked, waiting and running phases of each job, with the start and end times for each phase and the duration. This may be a more convenient format for visualisation and analysis:
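To make the pivoted shape concrete, here’s a Python sketch showing how one job’s four datetimes become three phase rows with start, end and duration. The job name and timings are made up:

```python
from datetime import datetime

# One job row with the four datetimes described above (made-up values)
job = {
    "Job": "Process Partition X[X]",
    "CreatedAt": datetime(2025, 1, 1, 10, 0, 0),
    "RunnableAt": datetime(2025, 1, 1, 10, 0, 5),
    "StartedAt": datetime(2025, 1, 1, 10, 0, 8),
    "FinishedAt": datetime(2025, 1, 1, 10, 1, 8),
}

# One row per phase, like the RefreshJobsPivoted query
phases = [
    ("Blocked", job["CreatedAt"], job["RunnableAt"]),
    ("Waiting", job["RunnableAt"], job["StartedAt"]),
    ("Running", job["StartedAt"], job["FinishedAt"]),
]
for name, start, end in phases:
    print(name, (end - start).total_seconds())
```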
The Links query gives you the dependencies between the jobs:
Having loaded all this data into a Lakehouse you can now build a report on it. As I said, I’m sure there’s a way of visualising all of this data in a way that shows all the durations and dependencies, but even a basic table report on the RefreshJobs table like this is really useful because it shows you which jobs were blocked, which ones had to wait, and which ones took a long time to run:
In this example (the Dependencies example from my previous post, but where the refresh has maxParallelism set to 1, so there is blocking as well as waiting) you can see that refreshing table X took 60 seconds, that the refresh for table Y had to wait for 60 seconds before it could start and took 10 seconds, and that refresh for table XY was blocked for 70 seconds before it could start. So, if you wanted to make this refresh run faster, you would want to understand why table X was so slow and also look at increasing the amount of parallelism so Y did not have to wait.
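The timeline in this example can be sanity-checked with some simple arithmetic (timings in seconds, taken from the description above):

```python
# With maxParallelism set to 1 only one job runs at a time:
# X runs first, Y must wait for X, and XY is blocked until both finish
x_start, x_duration = 0, 60
y_wait = x_start + x_duration          # Y is runnable at 0 but waits for X
y_start, y_duration = y_wait, 10
xy_blocked_until = max(x_start + x_duration, y_start + y_duration)

print(y_wait)            # → 60, seconds Y spent waiting
print(xy_blocked_until)  # → 70, seconds XY spent blocked before it could start
```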
Following on from my previous post showing how you can visualise the job graph for a Power BI Import mode semantic model refresh, in this post I will look at how you can interpret what the job graph tells you – specifically, explaining the concepts of blocking and waiting. As always, simple examples are the best way of doing this.
Blocking
Blocking occurs when one job can only start when one or more other jobs have completed because there is a dependency between them. Consider the following semantic model:
X and Y are tables that contain a single numeric column. X takes 1 minute to load while Y takes 10 seconds to load (I forced this delay using a technique I blogged about here). XYUnion is a DAX calculated table that unions all the rows from X and Y with the following definition:
XYUnion = UNION(X,Y)
As you can imagine, the job that refreshes XYUnion can only start once the jobs that refresh both X and Y have finished; XYUnion will be blocked until both X and Y have refreshed. Here’s what the job graph for the refresh of this semantic model looks like:
At the centre is the job “Process Partition XYUnion[XYUnion]” which refreshes the calculated table XYUnion. The arrows going from this job to the jobs “Process Partition X[X]” and “Process Partition Y[Y]”, which refresh tables X and Y, show that this job depends on those two other jobs.
Hovering over the “Process Partition XYUnion[XYUnion]” job shows the following popup:
There are four datetime values here: CreatedAt, RunnableAt, StartedAt and FinishedAt. There are also three durations:
Block is the elapsed time between CreatedAt and RunnableAt: the amount of time that elapsed before all the jobs this job depends on had completed. Anything other than zero here means that blocking occurred.
Wait is the elapsed time between RunnableAt and StartedAt. A job becomes runnable when all the jobs it depends on have completed, but even then it may not be able to start because Power BI only allows a certain number of jobs to refresh in parallel (see this post for more details and how to control the amount of parallelism). Waiting is described in the example below.
Run is the elapsed time between StartedAt and FinishedAt and is the amount of time the job itself took to run.
In this case you can see that the value for Block for XYUnion is 1 minute: X and Y have no preceding jobs so they kick off at the same time, X takes 1 minute to run and Y takes 10 seconds, so it is 1 minute before XYUnion can run. The popups for X and Y show 1 minute and 10 seconds respectively for Run, as you would expect:
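The arithmetic behind that Block value can be sketched in a couple of lines: a job becomes runnable only when the slowest of its dependencies has finished. This is just an illustration of the concept, with the durations from the example hard-coded.

```python
# Why XYUnion is blocked for 60 seconds: X and Y both start at time 0,
# and XYUnion becomes runnable only when the slower of the two finishes.
finished_at = {"X": 60, "Y": 10}          # seconds after the refresh starts
dependencies = {"XYUnion": ["X", "Y"]}

runnable_at = max(finished_at[d] for d in dependencies["XYUnion"])
block = runnable_at - 0                   # XYUnion was created at time 0
print(block)  # 60
```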
One last thing to mention: in the full job graph diagram above you’ll see that certain nodes are highlighted in red. That’s because they are on the critical path, which is documented here; it’s the chain of jobs that dictates the overall length of the refresh, so if you want to make your refresh faster then you need to tune the jobs on the critical path. In this case the critical path goes through X to XYUnion: if you wanted the whole semantic model to refresh faster you would need to tune the refresh for either X or XYUnion; tuning the refresh for Y would make no difference to the overall semantic model refresh time.
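The critical path is just the longest path through the dependency graph when every job starts as soon as it can. As a hedged sketch of the concept (not how Power BI actually computes it), here it is for the example above, ignoring slot limits:

```python
from functools import lru_cache

# Run durations (seconds) and dependencies from the example above
duration = {"X": 60, "Y": 10, "XYUnion": 5}
depends_on = {"X": [], "Y": [], "XYUnion": ["X", "Y"]}

@lru_cache(maxsize=None)
def finish_time(job):
    """Earliest finish time if every job starts as soon as its dependencies allow."""
    start = max((finish_time(d) for d in depends_on[job]), default=0)
    return start + duration[job]

def critical_path(job):
    """Trace back through whichever dependency finished last."""
    deps = depends_on[job]
    if not deps:
        return [job]
    slowest = max(deps, key=finish_time)
    return critical_path(slowest) + [job]

last = max(duration, key=finish_time)     # the job that finishes the refresh
print(critical_path(last))                # ['X', 'XYUnion']
print(finish_time(last))                  # 65
```

Tuning Y would change nothing here: only shortening X or XYUnion shortens the path.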
Waiting
As I already mentioned, there is a limit on the number of jobs within a refresh that can run in parallel. The maximum number of jobs is represented by the number of “slots” – one slot can only run one job at a time – and in the screenshots of the popups above you can see each job has a slot number. If there are more jobs that could be run than there are slots available at a given time then some jobs have to wait for a slot.
Here’s another example: a semantic model with three tables, A, B and C, which each take 30 seconds to refresh.
There are no dependencies between the tables, so in theory each of these three tables could refresh in parallel. However, if you refresh with maxParallelism set to two, only two of the three can refresh at any one time. Here’s the job graph for a refresh that does that:
As you can see, the critical path goes through the refresh job for table A, and hovering over the “Process Partition A[A]” job shows the following:
While this job was not blocked at all because there are no jobs it depends on, it had to wait 30 seconds for a slot to become available; it eventually ran in slot 0. Hovering over the nodes for “Process Partition B[B]” and “Process Partition C[C]” shows that they ran in slot 0 and slot 1 respectively and neither of them was blocked or had to wait.
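The waiting behaviour in this example can be sketched as a toy slot simulation; this is an illustration of the idea only, not the real Power BI scheduler, and the order in which the jobs grab slots is an assumption chosen to match the screenshots.

```python
import heapq

# Three independent 30-second jobs with only two slots (maxParallelism = 2);
# B and C happen to get slots first, so A must wait.
jobs = [("B", 30), ("C", 30), ("A", 30)]
max_parallelism = 2

free_slots = list(range(max_parallelism))   # slot IDs currently available
running = []                                # min-heap of (finish_time, slot)
now = 0
schedule = []                               # (job, wait in seconds, slot)
for name, run_secs in jobs:
    if not free_slots:                      # every slot busy: this job must wait
        now, slot = heapq.heappop(running)
        free_slots.append(slot)
    slot = free_slots.pop(0)
    schedule.append((name, now, slot))
    heapq.heappush(running, (now + run_secs, slot))

print(schedule)  # [('B', 0, 0), ('C', 0, 1), ('A', 30, 0)]
```

The output matches the popups: B and C start immediately in slots 0 and 1, while A waits 30 seconds before running in slot 0.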
The job graph isn’t always the best way of visualising this type of parallelism; Phil Seamark’s original Power BI report for visualising refreshes has a page which shows the slots and the jobs in them but I think there’s probably a better way of visualising all of this data that shows slots as well as dependencies. Maybe a Deneb visual is the answer? If anyone has ideas I’d be interested to hear them! In any case, the first step to doing this is to extract all of this data from the .DGML file into a table and that’s what I’ll demonstrate how to do in the next post in this series.