Following on from my post two weeks ago about how to get the details of Power BI operations seen in the Capacity Metrics App using the OperationId column on the Timepoint Detail page, I thought it was important to point out that you can do the same thing with TSQL queries against a Fabric Warehouse/SQL Endpoint and with Spark jobs. These two areas of Fabric are outside my area of expertise so please excuse any mistakes or simplifications, but I know a lot of you are Fabric capacity admins so I hope you’ll find this useful.
First of all, how do you find the details of TSQL queries run against a Fabric Warehouse or SQL Endpoint that you see in the Capacity Metrics App? This is actually documented here, but let’s see a simple example. For TSQL queries run on a Fabric Warehouse or SQL Endpoint, the contents of the OperationId column represent the Distributed Statement Id of a query. In the Timepoint Detail page, in the Background Operations table with the OperationId column selected in the Optional Columns dropdown, you can take a single SQL Endpoint Query operation and copy the OperationId value (in this case 5BE63832-C0C7-457D-943B-C44FD49E5145):
…and then paste it into a TSQL query against the queryinsights.exec_requests_history DMV like so:
SELECT distributed_statement_id, start_time, command
FROM queryinsights.exec_requests_history
WHERE distributed_statement_id = '5BE63832-C0C7-457D-943B-C44FD49E5145';
…and you can get the actual SQL query that was run plus lots of other useful information:
For Spark jobs, the OperationId of an operation in the Capacity Metrics App represents the Livy Id of the job. Here’s an operation of type Notebook Run seen in the Background Operations table on the Timepoint Detail page:
In this case the OperationId value of 41c5dc84-534d-4d21-b3fd-7640705df092 of the job matches the Livy Id of the job seen on the Run Details tab in the Fabric Monitoring Hub:
Unfortunately at the time of writing other Fabric workloads do not yet emit an OperationId or, if they do, may not emit an OperationId that can be linked back to other monitoring data. But as always, if things change or I find out more, I’ll let you know.
The key to using Log Analytics and Workspace Monitoring to troubleshoot throttling is the Execution Metrics event. Queries and semantic model refreshes always have an Execution Metrics event associated with them, and you can make the connection between all events for a specific operation using the OperationId column in Workspace Monitoring (as mentioned in my last post) or the XmlaRequestId column in Log Analytics – these columns contain the same information, it’s just that the column name was changed in Workspace Monitoring. When a query or a refresh is delayed because of throttling the Execution Metrics event will include a metric called capacityThrottlingMs, which gives the total amount of time in ms that the query or refresh was delayed before it was allowed to execute.
The documentation for Execution Metrics contains a sample KQL query for Log Analytics that uses Execution Metrics to return aggregated data on throttling but doesn’t explain how it works. Here’s a version of that query rewritten for Workspace Monitoring that returns the individual query and refresh operations that were throttled in the last five days:
let EventsLast5Days =
SemanticModelLogs
| where Timestamp > ago(5d)
| where OperationName in ("CommandEnd", "QueryEnd", "DiscoverEnd", "ExecutionMetrics");
let ExecutionMetricsEvents =
EventsLast5Days
| where OperationName == "ExecutionMetrics"
| project ExecutionMetrics = EventText, OperationId
| extend EM = parse_json(ExecutionMetrics)
| extend capacityThrottlingMs = toint(EM.capacityThrottlingMs)
| project-away EM, ExecutionMetrics
| where capacityThrottlingMs > 0;
let QueryDiscoverRefreshEvents =
EventsLast5Days
| where OperationName in ("CommandEnd", "QueryEnd", "DiscoverEnd")
| project Timestamp, OperationName, OperationDetailName, QueryText = EventText, DurationMs, CpuTimeMs, OperationId, ExecutingUser;
ExecutionMetricsEvents
| join kind=inner QueryDiscoverRefreshEvents on OperationId
| project-away OperationId1
Here’s how this query works:
The EventsLast5Days variable gets all the events in the last 5 days which have an OperationName of either CommandEnd (which is fired when a refresh finishes), QueryEnd (which is fired when a DAX or MDX query finishes), DiscoverEnd (which is fired when a Discover command, used to retrieve metadata from the Power BI engine, finishes) or ExecutionMetrics.
The ExecutionMetricsEvents variable takes the EventsLast5Days variable and filters it down to just the ExecutionMetrics event. It then parses the JSON value containing the metrics in the EventText column, extracts the capacityThrottlingMs metric if it is present, and filters the events down to just those where capacityThrottlingMs is greater than 0.
The QueryDiscoverRefreshEvents variable takes the EventsLast5Days variable and filters it down to just the CommandEnd, DiscoverEnd and QueryEnd events.
Finally there is an inner join between the ExecutionMetricsEvents variable and the QueryDiscoverRefreshEvents variable on the OperationId column, which results in a table of the CommandEnd, DiscoverEnd and QueryEnd events where throttling was present.
A couple of important things to point out:
The duration of an event in Workspace Monitoring, given in the DurationMs column, does not include any delays added as a result of capacity throttling. As a result you can have queries that take 0ms but which were delayed by 20 seconds – something you might not realise if you only look at the DurationMs column.
Having the DAX queries that are run when you display a Power BI report delayed is bad enough, but throttling delays are also applied to Discover commands, and Power BI reports (and indeed every client tool) will run a number of these before they run the queries they need to render a report. This means that the impact of throttling on report performance can be a lot greater than you might expect.
The new Timepoint Item Detail page in the Capacity Metrics App, in preview at the time of writing, includes a slicer on the OperationId column. This means that you can do the opposite of what I showed in my previous blog post and filter data in the Capacity Metrics App using an OperationId found from a query in Workspace Monitoring. This would allow you to find the percentage of the available CUs on a capacity that a given query or refresh used.
It’s the week of Fabcon Europe and you’re about to be overwhelmed with new Fabric feature announcements. However there is a new blink-and-you’ll-miss-it feature that appeared in the latest version of the Fabric Capacity Metrics App (released on 11th September 2025, version 47) that won’t get any fanfare but which I think is incredibly useful – it allows you to link the Power BI operations (such as queries or refreshes) you see in the Capacity Metrics App back to Workspace Monitoring, Log Analytics or Profiler so you can get details such as the query text.
Let’s say you’re in the Capacity Metrics App on the existing Timepoint Detail page. In the top right-hand corner of both the “Interactive operations for time range” and “Background operations for time range” tables there is a dropdown box that allows you to display additional columns. This box now contains an option to display the OperationId column:
After you’ve added this column you’ll see it contains a GUID:
There is also, incidentally, a new page in preview called Timepoint Item Detail (preview) which is reached through the new Timepoint Summary (preview) page and which will eventually replace the Timepoint Detail page. If you haven’t seen this you should check it out: I think it’s a big improvement. This also has a dropdown box that allows you to show the OperationId column in its versions of the “Interactive operations for time range” and “Background operations for time range” tables.
This page also has a dropdown box at the top that allows you to filter operations by OperationId.
The OperationId is a unique identifier for each Power BI operation. Right clicking on this value and selecting Copy/Copy value to copy it:
…means that you can use this value to cross-reference with the log data you find in Workspace Monitoring, Log Analytics or Profiler. For example I have a workspace with Workspace Monitoring enabled and found the following OperationId in the Capacity Metrics App: a7a2d4d4-2a9b-4535-b65a-a0cc0389d821. The following KQL query run on Workspace Monitoring:
let OperationIdFromCapacityMetrics = "a7a2d4d4-2a9b-4535-b65a-a0cc0389d821";
SemanticModelLogs
| where OperationId == OperationIdFromCapacityMetrics
| where OperationName == "QueryEnd"
| project Timestamp, ItemName, EventText, DurationMs, OperationDetailName
| order by Timestamp
…returns the query text for the DAX query associated with this interactive operation:
There’s a lot of other information you can get by writing KQL queries from Workspace Monitoring (for some examples see here) such as the IDs of the visual and the report that generated the query. If you’re using Log Analytics or Profiler there is no OperationId column – it’s called XmlaRequestId in Log Analytics and RequestId in Profiler – but the same information is available there too.
This is very useful for admins trying to identify why a capacity is overloaded: it means that you can now see the details of expensive queries or refresh operations and understand why they are causing problems. Make sure you upgrade your Capacity Metrics App to the latest version and enable Workspace Monitoring on all important workspaces so you can do this!
[Thanks to Tim Bindas and Lukasz Pawlowski for letting me know about this]
Direct Lake mode in Power BI allows you to build semantic models on very large volumes of data, but because it is still an in-memory database engine there are limits on how much data it can work with. As a result it has rules – called guardrails – that it uses to check whether you are trying to build a semantic model that is too large. But what happens when you hit those guardrails? This week one of my colleagues, Gaurav Agarwal, showed me the results of some tests that he did which I thought I would share here.
Before I do that though, a bit more detail about what these guardrails are. They are documented in the table here; they vary by Fabric capacity SKU size and there are four of them, which are limits on:
The number of Parquet files per Delta table
The number of rowgroups per Delta table
The number of rows per table
The total size of the data used by the semantic model on disk
There is also a limit on the amount of memory that can be used by a semantic model, something I have blogged about extensively, but technically that’s not a guardrail.
Remember also that there are two types of Direct Lake mode (documented here): the original Direct Lake mode called Direct Lake on SQL Endpoints that I will refer to as DL/SQL and which has the ability to fall back to DirectQuery mode, and the newer version called Direct Lake on OneLake that I will refer to as DL/OL and which cannot fall back to DirectQuery.
For his tests, Gaurav built a Fabric Warehouse containing a single table. He then added more and more rows to this table to see how a Direct Lake semantic model built on this Warehouse behaved. Here’s what he found.
If you build a DL/SQL model that exceeds one of the guardrails for the capacity SKU that you are using then, when you refresh that model, the refresh will succeed and you will see the following warning message in the model Refresh History:
We noticed that the source Delta tables exceed the resource limits of the Premium or Fabric capacity requiring queries to fallback to DirectQuery mode. Ensure that the Delta tables do not exceed the capacity's guardrails for best query perf.
This means that even though the refresh has succeeded, because the model has exceeded one of the guardrails it will always fall back to DirectQuery mode – with all the associated performance implications.
If your DL/SQL model exceeds one of the guardrails for the largest Fabric SKU, an F2048, then refresh will fail but you will still be able to query the model and the model will again fall back to DirectQuery mode. For his tests, Gaurav loaded 52 billion rows into a table; the guardrail for the maximum number of rows in a table for an F2048 is 24 billion rows. The top-level message you get when you refresh in this case is simply:
An error occurred while processing the semantic model.
Although if you look at the details you’ll see a more helpful message:
We cannot refresh the semantic model because of a Delta table issue that causes framing to fail. The source Delta table '<oii>billionrows</oii>' has too many parquet files, which exceeds the maximum guardrails. Please optimize the Delta table. See https://go.microsoft.com/fwlink/?linkid=2316800 for guardrail details.
The DAX TableTraits() function, which another colleague, Sandeep Pawar, blogged about here, can also tell you the reason why a DL/SQL semantic model is falling back to DirectQuery mode. Running the following DAX query on the 52 billion row model:
EVALUATE TABLETRAITS()
…returned the following results:
This shows that the table called billionrows actually exceeds the guardrails for the number of files, the number of rowgroups and the number of rows.
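If you want to run this check from a Fabric notebook rather than DAX Studio – useful if you’re auditing a lot of models – here’s a minimal sketch using the evaluate_dax function from semantic-link; the semantic model and workspace names are placeholders you’d replace with your own.
import sempy.fabric as fabric
# Run TABLETRAITS() against a published Direct Lake model and return the results
# as a DataFrame - one row per table, showing which guardrails (if any) are exceeded.
# The semantic model and workspace names below are placeholders.
traits = fabric.evaluate_dax(
    dataset="Insert Semantic Model Name Here",
    workspace="Insert Workspace Name Here",
    dax_string="EVALUATE TABLETRAITS()"
)
display(traits)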
What about DL/OL models though? Since they cannot fall back to DirectQuery mode, when you try to build or refresh a DL/OL semantic model that exceeds a guardrail you’ll get an error and you won’t be able to query your semantic model at all. For example here’s what I saw in Power BI Desktop when I tried to use the 52 billion row table in a DL/OL model:
Something went wrong connecting to this item. You can open the item in your browser to see if there is an issue or try connecting again.
We cannot refresh the semantic model because of a Delta table issue that causes framing to fail. The source Delta table 'billionrows' has too many parquet files, which exceeds the maximum guardrails. Please optimize the Delta table.
All of this behaviour makes sense if you think about it, even though I wouldn’t have known exactly how things worked until I saw it for myself. Some behaviour may change in the future to make it more intuitive; if that happens I will update this post.
[Thanks to Gaurav for showing me all this – check out his podcast and the India Fabric Analytics and AI user group that he helps run on LinkedIn. Thanks also to Akshai Mirchandani and Phil Seamark for their help]
Two years ago I wrote a detailed post on how to do performance testing for Direct Lake semantic models. In that post I talked about how important it is to run worst-case scenario tests to see how your model performs when there is no model data present in memory, and how it was possible to clear all the data held in memory by doing a full refresh of the semantic model. Recently, however, a long-awaited performance improvement for Direct Lake has been released which means a full semantic model refresh may no longer page all data out of memory – which is great, but which also makes running performance tests a bit more complicated.
First of all, what is this new improvement? It’s called Incremental Framing and you can read about it in the docs here. Basically, instead of clearing all data out of memory when you do a full refresh of a Direct Lake model, the model now checks each Delta table it uses to see whether the data in it has actually changed. If it hasn’t changed then there’s no need to clear any data from that table out of memory. Since there’s a performance overhead to loading data into memory when a query runs this means that you’re less likely to encounter this overhead, and queries (especially for models where the data in some tables changes frequently) will be faster overall. I strongly recommend you to read the entire docs page carefully though, not only because it contains a lot of other useful information, but also because you might be loading data into your lakehouses in a way that prevents this optimisation from working.
Let me show you an example of this by revisiting a demo from a session I’ve done at several user groups and conferences on Power BI model memory usage (there are several recordings of it available, such as this one). Using a Direct Lake semantic model consisting of a single large table with 20 columns containing random numbers, if I use DAX Studio’s Model Metrics feature when there is no data held in memory and with the Direct Lake Behaviour setting in DAX Studio’s Options dialog set to ResidentOnly (to stop Model Metrics from loading data from all columns into memory when it runs):
Then, when I run Model Metrics, the size of each column in the semantic model is negligible and the Temperature and Last Accessed values for all model columns are blank:
Then, if I run a query that asks for data from just one column (in this case the column called “1”) in this table, like this:
…and then rerun Model Metrics, the size in memory for that column changes because, of course, it has been loaded into memory in order to run the query:
Zooming in on the Model Metrics table columns from the previous screenshot that show the size in memory:
And here are the Temperature and Last Accessed columns from the same screenshot which are no longer blank:
Since the query had to bring the column into memory before it could run, the DAX query took around 5.3 seconds. Running the same query after that, even after using the Clear Cache button in DAX Studio, took only about 0.8 seconds because the data needed for the query was already resident in memory.
OK, so far nothing has changed in terms of behaviour. However if you do a full refresh from the Power BI UI without making any changes to the underlying Delta tables:
And then rerun the Model Metrics, nothing changes and the data is still in memory! As a result the DAX query above still only takes about 0.8 seconds.
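You can also check column residency yourself from a notebook rather than relying on DAX Studio. Here’s a hedged sketch: it assumes the INFO.STORAGETABLECOLUMNS DAX function is available on your model and that it returns the DICTIONARY_TEMPERATURE and DICTIONARY_LAST_ACCESSED columns that DAX Studio surfaces as Temperature and Last Accessed; the model and workspace names are placeholders.
import sempy.fabric as fabric
# A hedged sketch: list storage engine metadata for every column in the model.
# Assumes INFO.STORAGETABLECOLUMNS() is available and exposes DICTIONARY_TEMPERATURE
# and DICTIONARY_LAST_ACCESSED (the values DAX Studio shows as Temperature and
# Last Accessed). Model and workspace names are placeholders.
columns = fabric.evaluate_dax(
    dataset="Insert Semantic Model Name Here",
    workspace="Insert Workspace Name Here",
    dax_string="EVALUATE INFO.STORAGETABLECOLUMNS()"
)
# Columns that have been paged into memory have a non-blank temperature
display(columns)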
So how do you get that worst-case performance again? As mentioned in the docs here, you now need to do a refresh of type clearValues followed by a full refresh. You can’t do a refresh of type clearValues in the Power BI UI though, so the easiest way to do one is to use a Fabric notebook and Semantic Link Labs. Here’s how. First install Semantic Link Labs:
%pip install semantic-link-labs
Then use the following code in a notebook cell to do a refresh of type clearValues followed by a full refresh:
import sempy_labs as labs
WorkspaceName = "Insert Workspace Name Here"
SemanticModelName = "Insert Semantic Model Name Here"
# run a refresh of type clearValues first
labs.refresh_semantic_model(dataset=SemanticModelName, workspace=WorkspaceName, refresh_type="clearValues")
# then a refresh of type full
labs.refresh_semantic_model(dataset=SemanticModelName, workspace=WorkspaceName, refresh_type="full")
After doing this on my model, Model Metrics shows that the column called “1” that was previously in memory is no longer resident:
…and the query above once again takes 5 seconds to run.
So, as you can see, if you’re doing performance testing of a Direct Lake model you now need to make sure you do a refresh of type clearValues and a full refresh of your model before each test to ensure no data is resident in memory and get worst-case performance readings, in addition to testing performance on a cold cache and a warm cache.
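If you’re running these tests repeatedly, you could wrap the clearValues refresh, the full refresh and a timed test query into a single notebook function. Here’s a minimal sketch that reuses Semantic Link Labs for the refreshes and semantic-link’s evaluate_dax for the query; the workspace name, model name and test query are placeholders, and the measured time will include a little notebook overhead on top of the pure query duration.
import time
import sempy.fabric as fabric
import sempy_labs as labs

def run_cold_test(workspace: str, model: str, dax_query: str) -> float:
    # Page all data out of memory, then do a full refresh so the model is queryable again
    labs.refresh_semantic_model(dataset=model, workspace=workspace, refresh_type="clearValues")
    labs.refresh_semantic_model(dataset=model, workspace=workspace, refresh_type="full")
    # Time the test query against the now-cold model
    start = time.time()
    fabric.evaluate_dax(dataset=model, workspace=workspace, dax_string=dax_query)
    return time.time() - start

# Placeholder names and a hypothetical test query - replace with your own
duration = run_cold_test(
    workspace="Insert Workspace Name Here",
    model="Insert Semantic Model Name Here",
    dax_query="EVALUATE ROW(\"Rows\", COUNTROWS('YourTable'))"
)
print(f"Worst-case query duration: {duration:.1f} seconds")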
There are a lot of really cool free, community-developed tools and resources out there for Power BI and Fabric – so many that it’s easy to miss announcements about them. In this post I thought I’d highlight a few that came out recently and which you might want to check out.
Let’s start with the Fabric Toolbox, a collection of tools, samples, scripts and accelerators created and maintained by some of my colleagues here at Microsoft. The most widely-known tool in there is FUAM (Fabric Unified Admin Monitoring), a solution accelerator for monitoring enterprise Power BI and Fabric implementations. It’s the successor to Rui Romano’s Power BI monitoring solution, which is now deprecated, but it’s a lot richer than that. It’s already been the subject of a Guy In A Cube video though so I hope you’ve already come across it. There are other things in the Fabric Toolbox that should be more widely known though. My fellow CAT Phil Seamark (why doesn’t he blog anymore???) has been busy: a month ago he announced a new Power BI load testing tool (video here) based on Fabric notebooks which is much easier to configure than the previous load testing tool created by the CAT team. He’s also published a sample MCP Server that, among other things, can analyse a semantic model to see whether it follows best practices. Another colleague, Justin Martin, has published tools for auditing semantic models and DAX performance tuning in the toolbox too. Finally, with the deprecation of Power BI Datamarts looming, if you choose to replace them with Direct Lake semantic models based on Fabric Warehouse (although I think 90% of the Datamarts I’ve seen can be replaced with simple Import models) then there’s a migration accelerator here.
Elsewhere, if you’re a hardcore Power BI developer you’ll already know how useful TMDL View in Power BI Desktop is. Rui Romano recently announced that there’s a new gallery of TMDL scripts where you can see what’s possible with TMDL and share your own scripts. For example, there’s a script here that creates a date dimension table from a Power Query query.
Two years ago I blogged about a tool called PBI Inspector that provides rules-based best practices testing for the Power BI visualisation layer, created by yet another Microsoft colleague, Nat van Gulck. Not only is there now a V2 of PBI Inspector, which will be renamed Fab Inspector, but two weeks ago Nat announced a VS Code extension that allows you to write, debug and run rules from VS Code.
Last of all Gerhard Brueckl recently announced V2 of Fabric Studio, an incredibly powerful VS Code extension that acts as a wrapper for the Power BI/Fabric REST APIs. It lets you browse your workspaces and their contents from VS Code and create/update/delete items among other things; Gilbert Quevauvilliers recently wrote a nice blog post showing how you can use it to download any Power BI report from the Service easily.
That’s enough for now. If there are other tools or resources that came out recently that I didn’t mention, please leave a comment!
At the end of my last post I showed how, if you put the definition of a measure in the AI Instructions of your Power BI semantic model, Copilot can use it as a template to create new measures with slightly different logic. That got me wondering – can you do the same thing for DAX queries to solve even more complex problems? It turns out you can.
First of all, why would you want to do this? There are some problems that can be solved relatively easily by writing a DAX query but which are quite hard to solve if you’re limited to writing measures. ABC classification is one example of this. Marco and Alberto have a great article on this problem here; the problem is that when you want to increase the number of buckets/clusters/groups you use, or change the rules for which transactions go into which buckets, and make all this dynamic for end users building reports in Power BI Desktop, your semantic model can get very complicated very quickly. In a DAX query you can solve these problems much more easily, in part because you can define a calculated column at query time and use it to do your grouping. And since Copilot can generate and run DAX queries, it can be taught how to do ABC analysis by giving it an example DAX query and some instructions telling it how the query works. Let’s see an example.
The sample semantic model I’ve been using in this series has a fact table called Transactions where each row represents a real estate property sale in England and Wales. Let’s say you want to group these rows into different groups based on the price paid for the property. With nothing in the AI Instructions for the model, the following prompt:
Do an ABC analysis on the Transactions table, grouping all transactions into three groups named A, B and C based on the Price column where the upper boundary for the first group is £250000 and the upper boundary for the second group is £700000. Filter the transactions to just 1st January 2025.
…will sometimes generate the right result but will also sometimes generate a slightly strange or even incorrect result. Here’s an example where the information in the summary is correct but the DAX query is far from ideal:
EVALUATE
// Create a table with transactions on 1st January 2025 and assign an ABC group based on Price
SELECTCOLUMNS(
FILTER(
'Transactions',
'Transactions'[Date] = DATE(2025, 1, 1) // Filter transactions to 1st January 2025
),
"Price", 'Transactions'[Price],
"ABC Group",
SWITCH(
TRUE(),
'Transactions'[Price] <= 250000, "A", // Group A: Price up to £250,000
'Transactions'[Price] <= 700000, "B", // Group B: Price over £250,000 and up to £700,000
"C" // Group C: Price above £700,000
)
)
ORDER BY
[ABC Group] ASC,
[Price] ASC
This query returns one row per transaction rather than one row per group:
Not something you can give to an end user for sure. Here’s a DAX query that gives me exactly the result I want:
DEFINE
//Create a new calculated column at query time
//to create the groups for ABC classification
COLUMN 'Transactions'[Group] =
//Upper boundary for price for group A
VAR AUpperBoundary = 250000
//Upper boundary for price for group B
VAR BUpperBoundary = 700000
RETURN
//Return a different letter representing a group name
//based on where the value in the Price column sits in
//the boundaries defined
SWITCH(
TRUE(),
//If the price is less than or equal to the variable AUpperBoundary
//then return the value "A"
Transactions[Price]<=AUpperBoundary, "A (<=£250,000)",
//If the price is less than or equal to the variable BUpperBoundary
//then return the value "B"
Transactions[Price]<=BUpperBoundary, "B (>£250,000 and <=£700,000)",
//Otherwise return the value "C"
"C (>£700,000)"
)
//Returns the results of the classification
EVALUATE
SUMMARIZECOLUMNS(
'Transactions'[Group],
//Filter by a given date
KEEPFILTERS( TREATAS( {DATE(2025,1,1)}, 'Date'[Date] )),
"Count Of Transactions", [Count Of Transactions]
)
ORDER BY
'Transactions'[Group] ASC
Here’s what this query returns in DAX Studio:
Putting this query in the semantic model’s AI Instructions with some explanatory text, like so:
The following DAX query does an ABC analysis on the Transactions table, grouping transactions into three groups called A, B and C, for the 1st January 2025. The transactions with the lowest prices always go into group A, then subsequent letters represent higher price ranges. If the user asks for an ABC analysis use this query as a template.
DEFINE
//Create a new calculated column at query time
//to create the groups for ABC classification
COLUMN 'Transactions'[Group] =
//Upper boundary for price for group A
VAR AUpperBoundary = 250000
//Upper boundary for price for group B
VAR BUpperBoundary = 700000
RETURN
//Return a different letter representing a group name
//based on where the value in the Price column sits in
//the boundaries defined
SWITCH(
TRUE(),
//If the price is less than or equal to the variable AUpperBoundary
//then return the value "A"
Transactions[Price]<=AUpperBoundary, "A (<=£250,000)",
//If the price is less than or equal to the variable BUpperBoundary
//then return the value "B"
Transactions[Price]<=BUpperBoundary, "B (>£250,000 and <=£700,000)",
//Otherwise return the value "C"
"C (>£700,000)"
)
//Returns the results of the classification
EVALUATE
SUMMARIZECOLUMNS(
'Transactions'[Group],
//Filter by a given date
KEEPFILTERS( TREATAS( {DATE(2025,1,1)}, 'Date'[Date] )),
"Count Of Transactions", [Count Of Transactions]
)
ORDER BY
'Transactions'[Group] ASC
…means that the prompt above not only returns what you would expect:
…but more importantly, variations on the prompt such as:
Do an ABC analysis on the Transactions table, grouping all transactions into four groups named A, B, C and D based on the Price column where the upper boundary for the first group is £300000, the upper boundary for the second group is £750000 and the upper boundary for the third group is £900000. Filter the transactions to just 16th January 2025.
…take the template query and modify it as you would expect:
Note that the prompt has not only changed the date to filter by but increased the number of groups from three to four, and that Copilot has modified the DAX query to handle this – which I think is very impressive. Here’s the DAX query Copilot generated:
// DAX query generated by Fabric Copilot with "Group all transactions from the Transactions table into four groups named A, B, C, and D based on the Price column, where group A includes transactions with Price up to £300,000, group B includes transactions with Price above £300,000 and up to £750,000, group C includes transactions with Price above £750,000 and up to £900,000, and group D includes transactions with Price above £900,000. Filter the transactions to just 16th January 2025. Show the count of transactions in each group."
DEFINE
// Create a new calculated column at query time
// to classify transactions into groups A, B, C, and D based on the Price column
COLUMN 'Transactions'[Group] =
// Define the upper boundaries for each group
VAR AUpperBoundary = 300000
VAR BUpperBoundary = 750000
VAR CUpperBoundary = 900000
RETURN
SWITCH(
TRUE(),
'Transactions'[Price] <= AUpperBoundary, "A (<=£300,000)",
'Transactions'[Price] <= BUpperBoundary, "B (>£300,000 and <=£750,000)",
'Transactions'[Price] <= CUpperBoundary, "C (>£750,000 and <=£900,000)",
"D (>£900,000)"
)
EVALUATE
SUMMARIZECOLUMNS(
'Transactions'[Group],
// Filter transactions to include only the 16th January 2025
KEEPFILTERS( TREATAS( { DATE(2025,1,16) }, 'Date'[Date] ) ),
"Count Of Transactions", [Count Of Transactions]
)
ORDER BY
'Transactions'[Group] ASC
Tools that convert natural language to SQL, like Fabric Data Agents, all allow you to specify example SQL queries like this so it’s hardly a surprise that the same approach applies to DAX. What’s interesting is the type of problems that giving Copilot templates of DAX measures and queries allows you to solve, and I’m only just starting to think of the possibilities.
Continuing my (already very long) series on what information you should be adding to the AI Instructions of your semantic model and why, in this post I’ll show you the benefits of adding the DAX definitions of your measures.
A really common question from end users viewing a Power BI report is “how is this measure calculated?”. As a result I have seen model developers use techniques like this to display either a text description of how the measure works in a report or the actual DAX definition. It is therefore not a surprise that if end users are using Copilot they will ask the same question. Unfortunately Copilot cannot – or rather should not, at the time of writing – see the definitions of the measures in your model. Most of the time if a user asks to see how a measure is defined then Copilot will say that it can’t show the definition:
[Argh, yes I know it’s calling a measure a calculated column]
…although sometimes it does seem to be able to get the definition by writing a DAX query – but I have also seen it hallucinate and come up with a plausible-looking definition which isn’t the actual definition. Anyway, it certainly can’t reliably show the definition of a measure or a description of how it works.
Adding all the measure definitions to the model’s AI Instructions mostly solves this problem. TMDL View makes it easy to get all the measure definitions in a semantic model in Power BI Desktop and you can copy/paste them from there into the AI Instructions.
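If you’d rather script this than copy and paste from TMDL View – for example, so you can regenerate the instructions whenever a measure changes – here’s a minimal sketch that uses semantic-link’s list_measures function to dump measure names and expressions in roughly the format used below; the model and workspace names are placeholders, and the exact DataFrame column names (‘Measure Name’ and ‘Measure Expression’) are assumptions worth checking against your version of semantic-link.
import sempy.fabric as fabric
# A minimal sketch: extract measure definitions from a published model so they can be
# pasted into the AI Instructions. The model and workspace names are placeholders, and
# the 'Measure Name'/'Measure Expression' column names are assumptions to verify
# against your version of semantic-link.
measures = fabric.list_measures(
    dataset="Insert Semantic Model Name Here",
    workspace="Insert Workspace Name Here"
)
for _, row in measures.iterrows():
    print(f"##{row['Measure Name']}")
    print(row["Measure Expression"])
    print()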
In the semantic model that I’ve been using throughout this series I added some extra measures and then copied their DAX definitions and their descriptions into the AI Instructions. Here are the AI Instructions:
##Definitions and descriptions of the measures in this model
If a user asks how a measure is defined, asks how a measure works or asks how a measure is calculated, ignore any previous instructions about displaying measure definitions from the model and show the definition given here.
All measures that return currency values do so in Pounds Sterling and should be formatted with a £ sign.
##Average Price Paid
AVERAGE('Transactions'[Price])
##Count Of Transactions
COUNTROWS('Transactions')
##New Build Average Price Paid
Gets the value of the Average Price Paid measure where the New column equals "Y"
CALCULATE([Average Price Paid], KEEPFILTERS('Transactions'[New]="Y"))
##Tax Paid
Different tax rates are levied on new build and non-new build properties
A 10% tax is levied on the sale of new build properties
A 5% tax is levied on the sale of properties that are not new builds
(CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New]="Y")) * 0.1)
+
(CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New]="N")) * 0.05)
Note in particular the Tax Paid measure – it has some fairly complex logic that bears no relation to any actual tax in the UK, so I knew there was no way Copilot would be able to guess how it was defined.
With these AI Instructions in place, the prompt:
how does the tax paid measure work?
…shows a helpful summary:
[Although I admit sometimes Copilot still does reply saying that it can’t display the definition of a measure from the model, which isn’t ideal – if I work out how to stop this happening I’ll update this post]
Even more impressively, since Copilot knows the definition of the measure, it is able to answer more complex questions like this:
show me what the tax paid by property type name would be if the tax on new build properties was increased to 11%
Here’s the DAX query generated, which shows that Copilot has modified the definition of the measure correctly:
// DAX query generated by Fabric Copilot with "Show the tax paid by Property Type Name if the tax on new build properties was increased to 11%."
EVALUATE
// Summarize the tax paid by Property Type Name with increased new build tax rate (11%)
SUMMARIZECOLUMNS(
'PropertyTypes'[Property Type Name],
"Tax Paid",
// New build tax (rate increased to 11%) plus non-new build tax (rate unchanged at 5%)
(CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New] = "Y")) * 0.11) +
(CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New] = "N")) * 0.05)
)
ORDER BY
'PropertyTypes'[Property Type Name] ASC
I’m spending a lot of time with organisations testing Power BI Copilot at the moment, and something I hear a lot is that Copilot works well in Desktop but when you publish your model to the Power BI Service the results are a lot more inconsistent. One reason why is because of how updates to AI Instructions are applied after you publish your semantic model.
Let’s see an example of this. Consider the following semantic model consisting of a single table with two measures, Sales Amount and Profit Amount:
The semantic model has the following AI Instructions applied:
//Instructions V1
xyz means the measure Sales Amount
The instructions here don’t make much sense, but using a meaningless term like “xyz” makes it easier to test whether Copilot is using an instruction or not.
In Power BI Desktop, the following Copilot prompt returns the results you’d expect with xyz understood as Sales Amount:
show xyz
If you publish this model to an empty workspace in the Power BI Service then this prompt returns the same correct result.
[By the way, the message “Copilot is currently syncing with the data model. Results may be inconsistent until the sync is finished” will be the subject of a future blog post – it’s not connected to what I’m describing in this post, it relates to how Copilot needs to index the text values in your semantic model, which is a separate process]
So far so good. Going back to Power BI Desktop, changing the AI Instructions like so:
//Instructions V2
xyz means the measure Sales Amount
kqb means the measure Profit Amount
…then closing and reopening the Copilot pane in Desktop and entering the prompt:
show kqb
…also returns the result you would expect, with kqb understood as Profit Amount
However, if you publish the same model up to the same workspace as before – so you are overwriting the previous version of the model in the Service – and then use the same prompt immediately after publishing:
…Copilot returns an incorrect result: it does not understand what “kqb” means. Why?
After you publish changes to a Power BI semantic model it can take a few minutes, sometimes up to an hour, for updates to the AI Instructions to be applied. This means if you’re testing Power BI Copilot in the Service you may need to be patient if you want to see the impact of any changes to AI Instructions, or do your testing in Power BI Desktop.
How can you know whether the latest version of your AI Instructions is being used in the Service when you do your testing? In the Copilot pane in both Desktop and the Service there is an option to download diagnostics from the “…” menu in the top right-hand corner. This downloads a text file with diagnostic data in JSON format which contains a lot of useful information; most importantly it contains the AI Instructions used for the current Copilot session. The file contents aren’t documented anywhere, I guess because the structure could change at any time and it’s primarily intended for use by support, but there’s no reason why you as a developer shouldn’t look at it and use it.
For the second example in the Service above, where Copilot returned the wrong result, here’s what I found at the end of the diagnostics file:
As you can see the changes I made to the AI Instructions before publishing the second time had not been applied when I ran the prompt asking about kqb.
After waiting a while, and without making any other changes to the model, the same prompt eventually returned the correct results in the Service:
Looking at the diagnostics file for this Copilot session it shows that the new version of the AI Instructions was now being used:
Since looking in the diagnostics file is the only way (at least that I know of right now) to tell what AI Instructions are being used at any given time, it makes sense to do what I’ve done here and put a version number at the top of the instructions so you can tell easily whether your most recent changes are in effect.
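If you’re checking this regularly you can script the check rather than reading the diagnostics file by eye. Here’s a hedged sketch that simply searches the downloaded file for your version comment, without assuming anything about its undocumented structure; the file path and version string are placeholders.
from pathlib import Path

# A hedged sketch: check whether a downloaded Copilot diagnostics file contains the
# version marker from your AI Instructions. The file path and version string are
# placeholders; no assumptions are made about the undocumented JSON structure - the
# raw text is searched instead.
DIAGNOSTICS_FILE = "copilot-diagnostics.json"
EXPECTED_VERSION = "//Instructions V2"

raw = Path(DIAGNOSTICS_FILE).read_text(encoding="utf-8")
if EXPECTED_VERSION in raw:
    print(f"Found '{EXPECTED_VERSION}' - the latest AI Instructions are in use.")
else:
    print(f"'{EXPECTED_VERSION}' not found - the Service may not have picked up your changes yet.")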
One last point to mention is that if you’re deploying semantic models using Deployment Pipelines or Git, the docs state that you need to refresh your model after a deployment for changes to AI Instructions to take effect and that for DirectQuery or Direct Lake (but not Import) mode models this only works once per day.
If there are specific ways you want your data to be visualised by Power BI Copilot then you have two options. You can use Verified Answers to link questions to visuals on your reports and can even set them up so users can apply filters to these visuals in their prompts. Alternatively – and this is the topic of this post – you can use AI Instructions to tell Copilot which visuals should be used when certain combinations of measures and columns appear together in a prompt.
Using the same semantic model I have used in this series of posts, consider the following prompts and the responses returned when there are no AI Instructions:
show count of transactions by date for the first week of January 2025
This returns results as a line chart:
show count of transactions broken down by New
[There is a column in the model called New containing Y or N values]
This returns results as a bar chart:
Adding the following AI Instructions changes the visualisations used by Copilot for these prompts:
When showing the count of transactions measure by date always use a column chart.
When showing the count of transactions measure by New use a pie chart.
Here are the responses returned now:
show count of transactions by date for the first week of January 2025
show count of transactions broken down by New
Before someone comments, I’m not saying that these visualisation choices are better than the defaults, I’m just saying that you can use AI Instructions to change the visuals used in responses. There may be a type of analysis for your data or your industry that should always use a certain Power BI visual, one that isn’t the visual that Copilot chooses by default. And we all know that some users have strong opinions on which visuals should be used that don’t follow data visualisation best practices…