How To Get The Details Of Power BI Operations Seen In The Capacity Metrics App

It’s the week of Fabcon Europe and you’re about to be overwhelmed with new Fabric feature announcements. However, there is a new blink-and-you’ll-miss-it feature in the latest version of the Fabric Capacity Metrics App (version 47, released on 11th September 2025) that won’t get any fanfare but which I think is incredibly useful – it allows you to link the Power BI operations (such as queries or refreshes) you see in the Capacity Metrics App back to Workspace Monitoring, Log Analytics or Profiler so you can get details such as the query text.

Let’s say you’re in the Capacity Metrics App on the existing Timepoint Detail page. In the top right-hand corner of both the “Interactive operations for time range” and “Background operations for time range” tables there is a dropdown box that allows you to display additional columns. This box now contains an option to display the OperationId column:

After you’ve added this column you’ll see it contains a GUID:

There is also, incidentally, a new page in preview called Timepoint Item Detail (preview) which is reached through the new Timepoint Summary (preview) page and which will eventually replace the Timepoint Detail page. If you haven’t seen this you should check it out: I think it’s a big improvement. This also has a dropdown box that allows you to show the OperationId column in its versions of the “Interactive operations for time range” and “Background operations for time range” tables.

This page also has a dropdown box at the top that allows you to filter operations by OperationId.

The OperationId is a unique identifier for each Power BI operation. Right-clicking on this value and selecting Copy/Copy value to copy it:

…means that you can use this value to cross-reference with the log data you find in Workspace Monitoring, Log Analytics or Profiler. For example I have a workspace with Workspace Monitoring enabled and found the following OperationId in the Capacity Metrics App: a7a2d4d4-2a9b-4535-b65a-a0cc0389d821. The following KQL query run on Workspace Monitoring:

let OperationIdFromCapacityMetrics = "a7a2d4d4-2a9b-4535-b65a-a0cc0389d821";
SemanticModelLogs
| where OperationId == OperationIdFromCapacityMetrics
| where OperationName == "QueryEnd"
| project Timestamp, ItemName, EventText, DurationMs, OperationDetailName
| order by Timestamp

…returns the query text for the DAX query associated with this interactive operation:

There’s a lot of other information you can get by writing KQL queries from Workspace Monitoring (for some examples see here) such as the IDs of the visual and the report that generated the query. If you’re using Log Analytics or Profiler there is no OperationId column – it’s called XmlaRequestId in Log Analytics and RequestId in Profiler – but the same information is available there too.
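To make the column-name mapping concrete, here’s a small Python helper – entirely hypothetical, not part of any product – that builds a KQL filter clause for the right column depending on which log source you’re querying (for Profiler, which is trace-based rather than KQL-based, you’d filter on the RequestId column of the trace instead):

```python
# Hypothetical helper: the same operation identifier is exposed under
# different column names depending on the log source, as described above.
ID_COLUMN_BY_SOURCE = {
    "workspace_monitoring": "OperationId",
    "log_analytics": "XmlaRequestId",
    "profiler": "RequestId",
}

def build_kql_filter(source: str, operation_id: str) -> str:
    """Return a KQL where-clause filtering on the right column for a source."""
    column = ID_COLUMN_BY_SOURCE[source]
    return f'| where {column} == "{operation_id}"'

print(build_kql_filter("log_analytics", "a7a2d4d4-2a9b-4535-b65a-a0cc0389d821"))
```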

This is very useful for admins trying to identify why a capacity is overloaded: it means that you can now see the details of expensive queries or refresh operations and understand why they are causing problems. Make sure you upgrade your Capacity Metrics App to the latest version and enable Workspace Monitoring on all important workspaces so you can do this!

[Thanks to Tim Bindas and Lukasz Pawlowski for letting me know about this]

What Happens When Power BI Direct Lake Semantic Models Hit Guardrails?

Direct Lake mode in Power BI allows you to build semantic models on very large volumes of data, but because it is still an in-memory database engine there are limits on how much data it can work with. As a result it has rules – called guardrails – that it uses to check whether you are trying to build a semantic model that is too large. But what happens when you hit those guardrails? This week one of my colleagues, Gaurav Agarwal, showed me the results of some tests that he did which I thought I would share here.

Before I do that though, a bit more detail about what these guardrails are. They are documented in the table here and vary by Fabric capacity SKU size. There are four of them, setting limits on:

  • The number of Parquet files per Delta table
  • The number of rowgroups per Delta table
  • The number of rows per table
  • The total size of the data used by the semantic model on disk

There is also a limit on the amount of memory that can be used by a semantic model, something I have blogged about extensively, but technically that’s not a guardrail.
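As a sketch of how these rules work, here’s some Python that checks a table’s statistics against a set of guardrails. The 24-billion-row limit for an F2048 is mentioned later in this post; the other limit values below are placeholders – the real per-SKU numbers are in the documentation table linked above.

```python
from dataclasses import dataclass

@dataclass
class Guardrails:
    max_parquet_files: int
    max_rowgroups: int
    max_rows: int
    max_size_on_disk_gb: int

def exceeded(stats: dict, g: Guardrails) -> list:
    """Return the names of any guardrails the table's statistics exceed."""
    problems = []
    if stats["parquet_files"] > g.max_parquet_files:
        problems.append("parquet files")
    if stats["rowgroups"] > g.max_rowgroups:
        problems.append("rowgroups")
    if stats["rows"] > g.max_rows:
        problems.append("rows")
    if stats["size_on_disk_gb"] > g.max_size_on_disk_gb:
        problems.append("size on disk")
    return problems

# Placeholder limits, except the 24 billion row limit for an F2048
f2048 = Guardrails(max_parquet_files=10_000, max_rowgroups=10_000,
                   max_rows=24_000_000_000, max_size_on_disk_gb=4_000)
# A hypothetical 52-billion-row table breaks only the row-count guardrail here
print(exceeded({"parquet_files": 500, "rowgroups": 800,
                "rows": 52_000_000_000, "size_on_disk_gb": 900}, f2048))
```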

Remember also that there are two types of Direct Lake mode (documented here): the original Direct Lake mode called Direct Lake on SQL Endpoints that I will refer to as DL/SQL and which has the ability to fall back to DirectQuery mode, and the newer version called Direct Lake on OneLake that I will refer to as DL/OL and which cannot fall back to DirectQuery.

For his tests, Gaurav built a Fabric Warehouse containing a single table. He then added more and more rows to this table to see how a Direct Lake semantic model built on this Warehouse behaved. Here’s what he found.

If you build a DL/SQL model that exceeds one of the guardrails for the capacity SKU that you are using then, when you refresh that model, the refresh will succeed and you will see the following warning message in the model Refresh History:

We noticed that the source Delta tables exceed the resource limits of the Premium or Fabric capacity requiring queries to fallback to DirectQuery mode. Ensure that the Delta tables do not exceed the capacity's guardrails for best query perf.

This means that even though the refresh has succeeded, because the model has exceeded one of the guardrails it will always fall back to DirectQuery mode – with all the associated performance implications.

If your DL/SQL model exceeds one of the guardrails for the largest Fabric SKU, an F2048, then refresh will fail but you will still be able to query the model, which will again fall back to DirectQuery mode. For his tests, Gaurav loaded 52 billion rows into a table; the guardrail for the maximum number of rows in a table for an F2048 is 24 billion rows. The top-level message you get when you refresh in this case is simply:

An error occurred while processing the semantic model.

Although if you look at the details you’ll see a more helpful message:

We cannot refresh the semantic model because of a Delta table issue that causes framing to fail. The source Delta table '<oii>billionrows</oii>' has too many parquet files, which exceeds the maximum guardrails. Please optimize the Delta table. See https://go.microsoft.com/fwlink/?linkid=2316800 for guardrail details.

The DAX TableTraits() function, which another colleague, Sandeep Pawar, blogged about here, can also tell you why a DL/SQL semantic model is falling back to DirectQuery mode. Running the following DAX query on the 52 billion row model:

EVALUATE TABLETRAITS()

…returned the following results:

This shows that the table called billionrows actually exceeds the guardrails for the number of files, the number of rowgroups and the number of rows.

What about DL/OL models though? Since they cannot fall back to DirectQuery mode, when you try to build or refresh a DL/OL semantic model that exceeds a guardrail you’ll get an error and you won’t be able to query your semantic model at all. For example here’s what I saw in Power BI Desktop when I tried to use the 52 billion row table in a DL/OL model:

Something went wrong connecting to this item. You can open the item in your browser to see if there is an issue or try connecting again.
We cannot refresh the semantic model because of a Delta table issue that causes framing to fail. The source Delta table 'billionrows' has too many parquet files, which exceeds the maximum guardrails. Please optimize the Delta table. 

All of this behaviour makes sense if you think about it, even though I wouldn’t have known exactly how things worked until I saw it for myself. Some behaviour may change in the future to make it more intuitive; if that happens I will update this post.

[Thanks to Gaurav for showing me all this – check out his podcast and the India Fabric Analytics and AI user group that he helps run on LinkedIn. Thanks also to Akshai Mirchandani and Phil Seamark for their help]

Performance Testing Power BI Direct Lake Models Revisited: Ensuring Worst-Case Performance

Two years ago I wrote a detailed post on how to do performance testing for Direct Lake semantic models. In that post I talked about how important it is to run worst-case scenario tests to see how your model performs when there is no model data present in memory, and how it was possible to clear all the data held in memory by doing a full refresh of the semantic model. Recently, however, a long-awaited performance improvement for Direct Lake has been released which means a full semantic model refresh may no longer page all data out of memory – which is great, but which also makes running performance tests a bit more complicated.

First of all, what is this new improvement? It’s called Incremental Framing and you can read about it in the docs here. Basically, instead of clearing all data out of memory when you do a full refresh of a Direct Lake model, the model now checks each Delta table it uses to see whether the data in it has actually changed. If it hasn’t changed then there’s no need to clear any data from that table out of memory. Since there’s a performance overhead to loading data into memory when a query runs, this means you’re less likely to encounter that overhead, and queries (especially for models where the data in some tables changes frequently) will be faster overall. I strongly recommend you read the entire docs page carefully though, not only because it contains a lot of other useful information, but also because you might be loading data into your lakehouses in a way that prevents this optimisation from working.
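The check described above can be modelled in a few lines of Python. This is a toy sketch only – the real engine mechanism is more sophisticated – but it captures the idea that on a full refresh only tables whose underlying Delta data has moved on need to be paged out of memory:

```python
# Toy model of incremental framing: compare the version of each Delta table
# that was current when its data was loaded into memory against the version
# now in the lakehouse, and evict only the tables that have changed.
def tables_to_evict(in_memory_versions: dict, current_versions: dict) -> set:
    """Return the tables whose Delta data has changed since it was loaded."""
    return {
        table
        for table, loaded_version in in_memory_versions.items()
        if current_versions.get(table) != loaded_version
    }

# Only Sales has changed, so only Sales loses its in-memory data
print(tables_to_evict(
    {"Sales": 10, "Date": 3, "Product": 7},
    {"Sales": 11, "Date": 3, "Product": 7},
))
```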

Let me show you an example of this by revisiting a demo from a session I’ve done at several user groups and conferences on Power BI model memory usage (there are several recordings of it available, such as this one). Using a Direct Lake semantic model consisting of a single large table with 20 columns containing random numbers, if I use DAX Studio’s Model Metrics feature when there is no data held in memory and with the Direct Lake Behaviour setting in DAX Studio’s Options dialog set to ResidentOnly (to stop Model Metrics from loading data from all columns into memory when it runs):

…then when I run Model Metrics the size of each column in the semantic model is negligible and the Temperature and Last Accessed values for all model columns are blank:

Then, if I run a query that asks for data from just one column (in this case the column called “1”) of this table, like this:

EVALUATE ROW("Test", DISTINCTCOUNT('SourceData'[1]))

…and then rerun Model Metrics, the size in memory for that column changes, because of course the column has been loaded into memory in order to run the query:

Zooming in on the Model Metrics table columns from the previous screenshot that show the size in memory:

And here are the Temperature and Last Accessed columns from the same screenshot which are no longer blank:

Since the query had to bring the column into memory before it could run, the DAX query took around 5.3 seconds. Running the same query after that, even after using the Clear Cache button in DAX Studio, took only about 0.8 seconds because the data needed for the query was already resident in memory.

OK, so far nothing has changed in terms of behaviour. However if you do a full refresh from the Power BI UI without making any changes to the underlying Delta tables:

…and then rerun Model Metrics, nothing changes: the data is still in memory! As a result the DAX query above still only takes about 0.8 seconds.

So how do you get that worst-case performance again? As mentioned in the docs here, you now need to do a refresh of type clearValues followed by a full refresh. You can’t do a refresh of type clearValues in the Power BI UI though, so the easiest way to do it is to use a Fabric notebook and Semantic Link Labs. Here’s how. First install Semantic Link Labs:

%pip install semantic-link-labs

Then use the following code in a notebook cell to do a refresh of type clearValues followed by a full refresh:

import sempy_labs as labs
WorkspaceName = "Insert Workspace Name Here"
SemanticModelName = "Insert Semantic Model Name Here"
# run a refresh of type clearValues first
labs.refresh_semantic_model(dataset=SemanticModelName, workspace=WorkspaceName, refresh_type="clearValues")
# then a refresh of type full
labs.refresh_semantic_model(dataset=SemanticModelName, workspace=WorkspaceName, refresh_type="full")

After doing this on my model, Model Metrics shows that the column called “1” that was previously in memory is no longer resident:

…and the query above once again takes 5 seconds to run.

So, as you can see, if you’re doing performance testing of a Direct Lake model you now need to make sure you do a refresh of type clearValues and a full refresh of your model before each test to ensure no data is resident in memory and get worst-case performance readings, in addition to testing performance on a cold cache and a warm cache.
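If you want to automate this kind of testing, the steps can be wrapped in a small harness. This is a library-agnostic sketch: the three callables are injected, and in a Fabric notebook they would wrap the two Semantic Link Labs refresh calls shown above plus whatever you use to run your test DAX query.

```python
import time

def measure_worst_case(run_clear_values, run_full_refresh, run_query):
    """Reset a Direct Lake model to its worst-case state (no column data
    resident in memory) and then time a query. In a real test the first two
    callables would wrap sempy_labs.refresh_semantic_model with
    refresh_type="clearValues" and refresh_type="full" respectively."""
    run_clear_values()   # page all model data out of memory
    run_full_refresh()   # then reframe the model
    start = time.perf_counter()
    run_query()          # this first query pays the cost of loading columns
    return time.perf_counter() - start
```

Calling it once gives a worst-case timing; calling `run_query` again afterwards without the refreshes gives the warm, data-resident timing to compare against.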

Useful Community Tools And Resources For Power BI And Fabric

There are a lot of really cool free, community-developed tools and resources out there for Power BI and Fabric – so many that it’s easy to miss announcements about them. In this post I thought I’d highlight a few that came out recently and which you might want to check out.

Let’s start with the Fabric Toolbox, a collection of tools, samples, scripts and accelerators created and maintained by some of my colleagues here at Microsoft. The most widely-known tool in there is FUAM (Fabric Unified Admin Monitoring), a solution accelerator for monitoring enterprise Power BI and Fabric implementations. It’s the successor to Rui Romano’s Power BI monitoring solution, which is now deprecated, but it’s a lot richer than that. It’s already been the subject of a Guy In A Cube video, though, so I hope you’ve already come across it. There are other things in the Fabric Toolbox that should be more widely known too. My fellow CAT Phil Seamark (why doesn’t he blog anymore???) has been busy: a month ago he announced a new Power BI load testing tool (video here) based on Fabric notebooks which is much easier to configure than the previous load testing tool created by the CAT team. He’s also published a sample MCP Server that, among other things, can analyse a semantic model to see whether it follows best practices. Another colleague, Justin Martin, has published tools for auditing semantic models and DAX performance tuning in the toolbox too. Finally, with the deprecation of Power BI Datamarts looming, if you choose to replace them with Direct Lake semantic models based on Fabric Warehouse (although I think 90% of the Datamarts I’ve seen can be replaced with simple Import models) then there’s a migration accelerator here.

Elsewhere, if you’re a hardcore Power BI developer you’ll already know how useful TMDL View in Power BI Desktop is. Rui Romano recently announced that there’s a new gallery of TMDL scripts where you can see what’s possible with TMDL and share your own scripts. For example, there’s a script here that creates a date dimension table from a Power Query query.

Two years ago I blogged about a tool called PBI Inspector that provides rules-based best practices testing for the Power BI visualisation layer, created by yet another Microsoft colleague, Nat van Gulck. Not only is there now a V2 of PBI Inspector, which will be renamed Fab Inspector, but two weeks ago Nat announced a VS Code extension that allows you to write, debug and run rules from VS Code.

Last of all Gerhard Brueckl recently announced V2 of Fabric Studio, an incredibly powerful VS Code extension that acts as a wrapper for the Power BI/Fabric REST APIs. It lets you browse your workspaces and their contents from VS Code and create/update/delete items among other things; Gilbert Quevauvilliers recently wrote a nice blog post showing how you can use it to download any Power BI report from the Service easily.

That’s enough for now. If there are other tools or resources that came out recently that I didn’t mention, please leave a comment!

Power BI Copilot, AI Instructions And DAX Query Templates

At the end of my last post I showed how, if you put the definition of a measure in the AI Instructions of your Power BI semantic model, Copilot can use it as a template to create new measures with slightly different logic. That got me wondering – can you do the same thing for DAX queries to solve even more complex problems? It turns out you can.

First of all, why would you want to do this? There are some problems that can be solved relatively easily by writing a DAX query but which are quite hard to solve if you’re limited to writing measures. ABC classification is one example of this. Marco and Alberto have a great article on this problem here; the problem is that when you want to increase the number of buckets/clusters/groups you use, or change the rules for which transactions go into which buckets, and make all this dynamic for end users building reports in Power BI Desktop, your semantic model can get very complicated very quickly. In a DAX query you can solve these problems much more easily, in part because you can define a calculated column at query time and use it to do your grouping. And since Copilot can generate and run DAX queries, it can be taught how to do ABC analysis by giving it an example DAX query and some instructions telling it how the query works. Let’s see an example.
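Before getting to the DAX, the core grouping logic is worth spelling out. Here it is as a plain Python sketch (made-up prices): each transaction goes into the first group whose upper boundary its price does not exceed, and adding or removing a boundary changes the number of groups, which is exactly the kind of dynamism that is hard to build with measures.

```python
def abc_group(price: float, upper_bounds: list) -> str:
    """Assign a group letter: "A" for prices up to the first boundary, "B" up
    to the second, and so on; anything above every boundary goes in the final
    catch-all group."""
    for i, upper in enumerate(upper_bounds):
        if price <= upper:
            return chr(ord("A") + i)
    return chr(ord("A") + len(upper_bounds))

# Three groups with boundaries at £250,000 and £700,000
print(abc_group(200_000, [250_000, 700_000]))
print(abc_group(1_000_000, [250_000, 700_000]))
```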

The sample semantic model I’ve been using in this series has a fact table called Transactions where each row represents a real estate property sale in England and Wales. Let’s say you want to group these rows into different groups based on the price paid for the property. With nothing in the AI Instructions for the model, the following prompt:

Do an ABC analysis on the Transactions table, grouping all transactions into three groups named A, B and C based on the Price column where the upper boundary for the first group is £250000 and the upper boundary for the second group is £700000. Filter the transactions to just 1st January 2025.

…will sometimes generate the right result but will also sometimes generate a slightly strange or even incorrect result. Here’s an example where the information in the summary is correct but the DAX query is far from ideal:

EVALUATE
  // Create a table with transactions on 1st January 2025 and assign an ABC group based on Price
  SELECTCOLUMNS(
    FILTER(
      'Transactions',
      'Transactions'[Date] = DATE(2025, 1, 1)  // Filter transactions to 1st January 2025
    ),
    "Price", 'Transactions'[Price],
    "ABC Group", 
      SWITCH(
        TRUE(),
        'Transactions'[Price] <= 250000, "A",      // Group A: Price up to £250,000
        'Transactions'[Price] <= 700000, "B",      // Group B: Price over £250,000 and up to £700,000
        "C"                                       // Group C: Price above £700,000
      )
  )
ORDER BY
  [ABC Group] ASC,
  [Price] ASC

This query returns one row per transaction rather than one row per group:

Not something you can give to an end user for sure. Here’s a DAX query that gives me exactly the result I want:

DEFINE
    // Create a new calculated column at query time
    // to create the groups for ABC classification
    COLUMN 'Transactions'[Group] =
        // Upper boundary for price for group A
        VAR AUpperBoundary = 250000
        // Upper boundary for price for group B
        VAR BUpperBoundary = 700000
        RETURN
            // Return a different letter representing a group name
            // based on where the value in the Price column sits in
            // the boundaries defined
            SWITCH(
                TRUE(),
                // If the price is less than or equal to the variable AUpperBoundary
                // then return the value "A"
                'Transactions'[Price] <= AUpperBoundary, "A (<=£250,000)",
                // If the price is less than or equal to the variable BUpperBoundary
                // then return the value "B"
                'Transactions'[Price] <= BUpperBoundary, "B (>£250,000 and <=£700,000)",
                // Otherwise return the value "C"
                "C (>£700,000)"
            )

// Returns the results of the classification
EVALUATE
SUMMARIZECOLUMNS(
    'Transactions'[Group],
    // Filter by a given date
    KEEPFILTERS( TREATAS( { DATE(2025,1,1) }, 'Date'[Date] ) ),
    "Count Of Transactions", [Count Of Transactions]
)
ORDER BY
    'Transactions'[Group] ASC

Here’s what this query returns in DAX Studio:

Putting this query in the semantic model’s AI Instructions with some explanatory text, like so:

The following DAX query does an ABC analysis on the Transactions table, grouping transactions into three groups called A, B and C, for the 1st January 2025.  The transactions with the lowest prices always go into group A, then subsequent letters represent higher price ranges. If the user asks for an ABC analysis use this query as a template.

DEFINE
    // Create a new calculated column at query time
    // to create the groups for ABC classification
    COLUMN 'Transactions'[Group] =
        // Upper boundary for price for group A
        VAR AUpperBoundary = 250000
        // Upper boundary for price for group B
        VAR BUpperBoundary = 700000
        RETURN
            // Return a different letter representing a group name
            // based on where the value in the Price column sits in
            // the boundaries defined
            SWITCH(
                TRUE(),
                // If the price is less than or equal to the variable AUpperBoundary
                // then return the value "A"
                'Transactions'[Price] <= AUpperBoundary, "A (<=£250,000)",
                // If the price is less than or equal to the variable BUpperBoundary
                // then return the value "B"
                'Transactions'[Price] <= BUpperBoundary, "B (>£250,000 and <=£700,000)",
                // Otherwise return the value "C"
                "C (>£700,000)"
            )

// Returns the results of the classification
EVALUATE
SUMMARIZECOLUMNS(
    'Transactions'[Group],
    // Filter by a given date
    KEEPFILTERS( TREATAS( { DATE(2025,1,1) }, 'Date'[Date] ) ),
    "Count Of Transactions", [Count Of Transactions]
)
ORDER BY
    'Transactions'[Group] ASC

…means that the prompt above not only returns what you would expect:

…but more importantly, variations on the prompt such as:

Do an ABC analysis on the Transactions table, grouping all transactions into four groups named A, B, C and D based on the Price column where the upper boundary for the first group is £300000, the upper boundary for the second group is £750000 and the upper boundary for the third group is £900000. Filter the transactions to just 16th January 2025.

…take the template query and modify it as you would expect:

Note that the prompt has not only changed the date to filter by but increased the number of groups from three to four, and that Copilot has modified the DAX query to handle this – which I think is very impressive. Here’s the DAX query Copilot generated:

// DAX query generated by Fabric Copilot with "Group all transactions from the Transactions table into four groups named A, B, C, and D based on the Price column, where group A includes transactions with Price up to £300,000, group B includes transactions with Price above £300,000 and up to £750,000, group C includes transactions with Price above £750,000 and up to £900,000, and group D includes transactions with Price above £900,000. Filter the transactions to just 16th January 2025. Show the count of transactions in each group."
DEFINE
// Create a new calculated column at query time 
// to classify transactions into groups A, B, C, and D based on the Price column
COLUMN 'Transactions'[Group] =
    // Define the upper boundaries for each group
    VAR AUpperBoundary = 300000
    VAR BUpperBoundary = 750000
    VAR CUpperBoundary = 900000
    RETURN
        SWITCH(
            TRUE(),
            'Transactions'[Price] <= AUpperBoundary, "A (<=£300,000)",
            'Transactions'[Price] <= BUpperBoundary, "B (>£300,000 and <=£750,000)",
            'Transactions'[Price] <= CUpperBoundary, "C (>£750,000 and <=£900,000)",
            "D (>£900,000)"
        )
        
EVALUATE
    SUMMARIZECOLUMNS(
        'Transactions'[Group],
        // Filter transactions to include only the 16th January 2025
        KEEPFILTERS( TREATAS( { DATE(2025,1,16) }, 'Date'[Date] ) ),
        "Count Of Transactions", [Count Of Transactions]
    )
ORDER BY 
    'Transactions'[Group] ASC

Tools that convert natural language to SQL, like Fabric Data Agents, all allow you to specify example SQL queries like this so it’s hardly a surprise that the same approach applies to DAX. What’s interesting is the type of problems that giving Copilot templates of DAX measures and queries allows you to solve, and I’m only just starting to think of the possibilities.

Power BI Copilot, AI Instructions And DAX Measure Definitions

Continuing my (already very long) series on what information you should be adding to the AI Instructions of your semantic model and why, in this post I’ll show you the benefits of adding the DAX definitions of your measures.

A really common question from end users viewing a Power BI report is “how is this measure calculated?”. As a result I have seen model developers use techniques like this to display either a text description of how the measure works or the actual DAX definition in a report. It is therefore no surprise that end users using Copilot will ask the same question. Unfortunately Copilot cannot – or rather should not, at the time of writing – see the definitions of the measures in your model. Most of the time, if a user asks to see how a measure is defined, Copilot will say that it can’t show the definition:

[Argh, yes I know it’s calling a measure a calculated column]

…although sometimes it does seem to be able to get the definition by writing a DAX query – but I have also seen it hallucinate and come up with a plausible-looking definition which isn’t the actual one. In any case, it certainly can’t reliably show the definition of a measure or a description of how it works.

Adding all the measure definitions to the model’s AI Instructions mostly solves this problem. TMDL View makes it easy to get all the measure definitions in a semantic model in Power BI Desktop and you can copy/paste them from there into the AI Instructions.

In the semantic model that I’ve been using throughout this series I added some extra measures and then copied their DAX definitions and their descriptions into the AI Instructions. Here are the AI Instructions:

##Definitions and descriptions of the measures in this model
If a user asks how a measure is defined, asks how a measure works or asks how a measure is calculated, ignore any previous instructions about displaying measure definitions from the model and show the definition given here.

All measures that return currency values do so in Pounds Sterling and should be formatted with a £ sign.

##Average Price Paid
AVERAGE('Transactions'[Price])

##Count Of Transactions
COUNTROWS('Transactions')

##New Build Average Price Paid
Gets the value of the Average Price Paid measure where the New column equals "Y"
CALCULATE([Average Price Paid], KEEPFILTERS('Transactions'[New]="Y")) 

##Tax Paid
Different tax rates are levied on new build and non-new build properties
A 10% tax is levied on the sale of new build properties
A 5% tax is levied on the sale of properties that are not new builds
(CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New]="Y")) * 0.1)
+
(CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New]="N")) * 0.05)

Note in particular the Tax Paid measure – it has some fairly complex logic that bears no relation to any actual tax in the UK, so I knew there was no way Copilot would be able to guess how it was defined.
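To make that logic explicit, here’s the Tax Paid calculation restated as plain Python with made-up sample transactions (the real measure is the DAX in the instructions above; this is just the arithmetic):

```python
def tax_paid(transactions):
    """transactions is a list of (price, new_flag) pairs: a 10% tax is levied
    on new build ("Y") sales and a 5% tax on all other ("N") sales."""
    new_build_total = sum(price for price, new in transactions if new == "Y")
    other_total = sum(price for price, new in transactions if new == "N")
    return new_build_total * 0.10 + other_total * 0.05

# 10% of 300,000 plus 5% of (200,000 + 500,000) = 30,000 + 35,000 = 65,000
sample = [(300_000, "Y"), (200_000, "N"), (500_000, "N")]
print(tax_paid(sample))
```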

With these AI Instructions in place, the prompt:

how does the tax paid measure work?

…shows a helpful summary:

[Although I admit sometimes Copilot still does reply saying that it can’t display the definition of a measure from the model, which isn’t ideal – if I work out how to stop this happening I’ll update this post]

Even more impressively, since Copilot knows the definition of the measure, it is able to answer more complex questions like this:

show me what the tax paid by property type name would be if the tax on new build properties was increased to 11%

Here’s the DAX query generated, which shows that Copilot has modified the definition of the measure correctly:

// DAX query generated by Fabric Copilot with "Show the tax paid by Property Type Name if the tax on new build properties was increased to 11%."
EVALUATE
  // Summarize the tax paid by Property Type Name with increased new build tax rate (11%)
  SUMMARIZECOLUMNS(
    'PropertyTypes'[Property Type Name],
    "Tax Paid", 
      // New build tax (rate increased to 11%) plus non-new build tax (rate unchanged at 5%)
      (CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New] = "Y")) * 0.11) +
      (CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New] = "N")) * 0.05)
  )
  ORDER BY
    'PropertyTypes'[Property Type Name] ASC

Why Do Power BI Copilot AI Instructions Seem To Work Better in Desktop Than In The Service?

I’m spending a lot of time with organisations testing Power BI Copilot at the moment, and something I hear a lot is that Copilot works well in Desktop but that when you publish your model to the Power BI Service the results are a lot more inconsistent. One reason for this is how updates to AI Instructions are applied after you publish your semantic model.

Let’s see an example of this. Consider the following semantic model consisting of a single table with two measures, Sales Amount and Profit Amount:

The semantic model has the following AI Instructions applied:

//Instructions V1
xyz means the measure Sales Amount

The instructions here don’t make much sense, but using a meaningless term like “xyz” makes it easier to test whether Copilot is using an instruction or not.

In Power BI Desktop, the following Copilot prompt returns the results you’d expect with xyz understood as Sales Amount:

show xyz

If you publish this model to an empty workspace in the Power BI Service then this prompt returns the same correct result.

[By the way, the message “Copilot is currently syncing with the data model. Results may be inconsistent until the sync is finished” will be the subject of a future blog post – it’s not connected to what I’m describing in this post, it relates to how Copilot needs to index the text values in your semantic model, which is a separate process]

So far so good. Going back to Power BI Desktop, changing the AI Instructions like so:

//Instructions V2
xyz means the measure Sales Amount
kqb means the measure Profit Amount

…then closing and reopening the Copilot pane in Desktop and entering the prompt:

show kqb

…also returns the result you would expect, with kqb understood as Profit Amount.

However, if you publish the same model up to the same workspace as before – so you are overwriting the previous version of the model in the Service – and then use the same prompt immediately after publishing:

…Copilot returns an incorrect result: it does not understand what “kqb” means. Why?

After you publish changes to a Power BI semantic model it can take a few minutes, sometimes up to an hour, for updates to the AI Instructions to be applied. This means if you’re testing Power BI Copilot in the Service you may need to be patient if you want to see the impact of any changes to AI Instructions, or do your testing in Power BI Desktop.

How can you know whether the latest version of your AI Instructions is being used in the Service when you do your testing? In the Copilot pane in both Power BI Desktop and the Service there is an option to download diagnostics from the “…” menu in the top right-hand corner. This downloads a text file with diagnostic data in JSON format which contains a lot of useful information; most importantly it contains the AI Instructions used for the current Copilot session. The file contents aren’t documented anywhere, I guess because the structure could change at any time and it’s primarily intended for use by support, but there’s no reason why you as a developer shouldn’t look at it and use it.

For the second example in the Service above, where Copilot returned the wrong result, here’s what I found at the end of the diagnostics file:

As you can see, the changes I made to the AI Instructions before publishing the second time had not been applied when I ran the prompt asking about kqb.

After waiting a while, and without making any other changes to the model, the same prompt eventually returned the correct results in the Service:

Looking at the diagnostics file for this Copilot session it shows that the new version of the AI Instructions was now being used:

Since looking in the diagnostics file is the only way (at least that I know of right now) to tell what AI Instructions are being used at any given time, it makes sense to do what I’ve done here and put a version number at the top of the instructions so you can tell easily whether your most recent changes are in effect.

One last point to mention: if you’re deploying semantic models using Deployment Pipelines or Git integration, the docs state that you need to refresh your model after a deployment for changes to AI Instructions to take effect, and that for DirectQuery or Direct Lake (but not Import) mode models this only works once per day.
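If your deployments are automated, that post-deployment refresh can be triggered programmatically too. Here’s a rough Python sketch using the documented Power BI REST API refreshes endpoint; acquiring the Azure AD access token (for example via MSAL) is out of scope here, and the workspace and dataset IDs are placeholders you’d supply from your own pipeline:

```python
import json
import urllib.request


def refresh_url(workspace_id: str, dataset_id: str) -> str:
    # Documented Power BI REST API endpoint for triggering a dataset refresh:
    # POST https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/refreshes
    return (
        "https://api.powerbi.com/v1.0/myorg/"
        f"groups/{workspace_id}/datasets/{dataset_id}/refreshes"
    )


def refresh_semantic_model(access_token: str, workspace_id: str, dataset_id: str) -> int:
    """POST to the refreshes endpoint; a 202 response means the refresh was queued."""
    body = json.dumps({"notifyOption": "NoNotification"}).encode("utf-8")
    req = urllib.request.Request(
        refresh_url(workspace_id, dataset_id),
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Calling this as the last step of a deployment should mean the new AI Instructions are picked up as soon as the Service allows, subject to the once-per-day limit mentioned above for DirectQuery and Direct Lake models.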

Power BI Copilot, AI Instructions And Visualisation Guidelines

If there are specific ways you want your data to be visualised by Power BI Copilot then you have two options. You can use Verified Answers to link questions to visuals on your reports and can even set them up so users can apply filters to these visuals in their prompts. Alternatively – and this is the topic of this post – you can use AI Instructions to tell Copilot which visuals should be used when certain combinations of measures and columns appear together in a prompt.

Using the same semantic model I have used in this series of posts, consider the following prompts and the responses returned when there are no AI Instructions:

show count of transactions by date for the first week of January 2025

This returns results as a line chart:

show count of transactions broken down by New

[There is a column in the model called New containing Y or N values]

This returns results as a bar chart:

Adding the following AI Instructions changes the visualisations used by Copilot for these prompts:

When showing the count of transactions measure by date always use a column chart.
When showing the count of transactions measure by New use a pie chart.

Here are the responses returned now:

show count of transactions by date for the first week of January 2025
show count of transactions broken down by New

Before someone comments, I’m not saying that these visualisation choices are better than the defaults, I’m just saying that you can use AI Instructions to change the visuals used in responses. There may be a type of analysis for your data or your industry that should always use a certain Power BI visual, one that isn’t the visual that Copilot chooses by default. And we all know that some users have strong opinions on which visuals should be used that don’t follow data visualisation best practices…

Power BI Copilot, AI Instructions And Semantic Model Relationships

Power BI Copilot knows about the relationships between measures and tables, which means that it’s very good at knowing which measures can be broken down by which columns. For example, consider the following model with two fact tables, one of which contains sales data dimensioned by Employee and Customer, and one which contains target data that is only dimensioned by Employee:

For extra fun, I’ve created a disconnected measure table to hold all the measures, which are defined as follows:

Sales Amount = SUM(Sales[Sales])
Target Amount = SUM(Targets[Target])
Target % = DIVIDE([Sales Amount],[Target Amount])

Without any AI Instructions added to the model, for the following prompt:

Show Sales Amount broken down by Customer

I get (as you would expect) the following correct response:

However for the prompt:

Show Target Amount broken down by Customer

…I get a response saying that you can’t show Target Amount broken down by Customer because there’s no direct relationship. This is, I think, the right response for most users.

Note that Copilot knows the relationships between the Sales Amount and Target Amount measures and the Customer dimension table even though they are on the disconnected measures table, which I think is very smart.

Even better, a prompt like this:

show a table with Employee Name, Customer Name, Sales Amount, Target Amount and Target %

Returns the result I would expect:

Copilot has understood that even though there’s no relationship between Customer and Target Amount, it makes sense to show Target Amount broken down by Customer in this case.

What about calculation groups? I’ve seen some inconsistent behaviour over the last few days but on the day I’m writing this post, calculation groups were working well in Copilot with no special AI Instructions despite the fact they have no relationship with any other table.

Adding a calculation group to the model like so with two calculation items, Actual Value and Forecast:

Actual Value = SELECTEDMEASURE()
Forecast = SELECTEDMEASURE() * 100

And prompting:

Show Forecast Sales Amount broken down by Customer

…gives the right response using the Forecast calculation item:

That said, I’m sure there will be scenarios where Copilot decides it can’t use a measure with a particular column even though you want it to, and you can override this behaviour with AI Instructions. However, I have found you need to be very careful about the instructions you provide to get the output you expect. For example, adding the following to the AI Instructions of my model:

For the Target Amount measure, ignore any previous instructions regarding not displaying measures with columns from tables where there is no direct relationship. 

…means that the prompt shown above which asks for Target Amount broken down by Customer:

Show Target Amount broken down by Customer

…returns the following result:

This surprised me a lot: I was expecting to see the same value repeated across all customers. I worked out that the reason different customers showed different values for Target Amount was that the visual was assuming an indirect relationship via the Sales table. Even then, I had no idea that it was possible to build a Power BI visual that did this when no many-to-many relationships exist in the model – I certainly couldn’t build the same visual myself manually. It’s one way of answering the question correctly though.

Altering the AI Instructions to tell Copilot not to do this was tricky but eventually I came up with this:

For the Target Amount measure, ignore any previous instructions regarding not displaying measures with columns from tables where there is no direct relationship. When displaying Target Amount by Customer do not assume there is an indirect relationship via the Sales table.

The same prompt then gave me the result I was expecting with the same Target Amount value repeating across all customers (and it’s interesting it decided it had to create a DAX query to do this too):

This tells me that it’s probably not a good idea to try to override Copilot’s default behaviour when it comes to relationships, tables and measures.

Power BI Copilot, AI Instructions And Dealing With Ambiguity

One of the most common questions I hear about Power BI Copilot is how you can stop it from guessing what a user means when they ask an ambiguous question, and instead get it to ask for clarification. This is an interesting problem because Copilot already does this to some extent; what you really want is a way to control its tolerance for ambiguity. What’s more, if Copilot guesses what the user means correctly you’re probably not going to complain or even notice; it’s only when it guesses incorrectly that you’re going to wish it had asked what the user meant.

Using the semantic model containing UK real estate sales data that I’ve used throughout this series of posts, with no AI Instructions added to the model, consider the following prompt:

where is the nicest place to live in England?

It’s a great example of a question that could potentially be answered from the data but only with more information on what the user means by “nicest”. I think Copilot comes back with a very good response here:

As you can see, Copilot doesn’t know what the user means by “nicest” and asks the user what criteria they want to use to determine whether a place is “nice”.

What about an example of where Copilot does make assumptions about what the user means? Take the following prompt:

show how the number of sales varied recently

This time it does come back with an answer. I think this is a good, useful response but I want you to notice that Copilot has made two assumptions here:

  1. It has interpreted “number of sales” as meaning the measure Count of Transactions
  2. It has taken the term “recently” and decided to show date-level data for the latest available month of data, April

Can this behaviour be changed? Yes, but it’s not an exact science. For example, adding the following to the AI Instructions of the model:

If you don't understand what the user is asking for never, ever guess - you must always ask for clarification

Here’s how Copilot now responds to the second prompt above:

As you can see, it’s now asking what is meant by “recently”. Clarifying this as follows:

"recently" means the days in April

…gives the same result as without the AI Instructions:

BUT – even though Copilot asked what “recently” meant, it still went ahead and assumed that “number of sales” meant the Count of Transactions measure. Adding to the AI Instructions to make it clear that Copilot should always ask if there’s any doubt about which measure to use like so:

If you don't understand what the user is asking for never, ever guess - you must always ask for clarification. In particular if the user does not explicitly mention the exact name of the measure they want to use, or the name of the measure is in any way ambiguous, do not return a result. Instead ask the user which measure they want to use. Be extremely cautious about which measure you use.

…results in a response that asks for clarification not only about what “recently” means but also what “number of sales” means:

Copilot doesn’t do this reliably though: even though it always seems to ask about “recently” now it only sometimes asks for clarification about “number of sales”. Sometimes is better than never though; indeed it’s the kind of uncertainty that is expected with Generative AI. I think I need to do some more research into what’s going on here. At least this shows that AI Instructions can be used to make Copilot more cautious around ambiguous questions and more likely to ask for clarification.

[Update 17th July 2025]

After testing this some more (and possibly after an update to the Power BI Service in the last few days) I have come up with some AI Instructions that seem to be a lot more reliable when it comes to asking the user which measure they want to use and asking them to define what “recently” means:

If you don't understand what the user is asking for never, ever guess - you must always ask for clarification. If there are multiple points you don't understand it is essential that you ask the user to clarify all of them.

In particular, ignore all previous instructions regarding which measure to select and make sure you obey the following rule: if the user does not explicitly mention the exact name of the measure they want to use, or the name of the measure is in any way ambiguous, do not return a result. Instead ask the user which measure they want to use. Be extremely cautious about which measure you use! At the same time, as I said, remember to clarify other, non-measure related ambiguities.