Why Do Power BI Copilot AI Instructions Seem To Work Better in Desktop Than In The Service?

I’m spending a lot of time with organisations testing Power BI Copilot at the moment, and something I hear a lot is that Copilot works well in Desktop but that when you publish your model to the Power BI Service the results are a lot more inconsistent. One reason is how updates to AI Instructions are applied after you publish your semantic model.

Let’s see an example of this. Consider the following semantic model consisting of a single table with two measures, Sales Amount and Profit Amount:

The semantic model has the following AI Instructions applied:

//Instructions V1
xyz means the measure Sales Amount

The instructions here don’t make much sense, but using a meaningless term like “xyz” makes it easier to test whether Copilot is using an instruction or not.

In Power BI Desktop, the following Copilot prompt returns the results you’d expect with xyz understood as Sales Amount:

show xyz

If you publish this model to an empty workspace in the Power BI Service then this prompt returns the same correct result.

[By the way, the message “Copilot is currently syncing with the data model. Results may be inconsistent until the sync is finished” will be the subject of a future blog post – it’s not connected to what I’m describing in this post, it relates to how Copilot needs to index the text values in your semantic model, which is a separate process]

So far so good. Going back to Power BI Desktop, changing the AI Instructions like so:

//Instructions V2
xyz means the measure Sales Amount
kqb means the measure Profit Amount

…then closing and reopening the Copilot pane in Desktop and entering the prompt:

show kqb

…also returns the result you would expect, with kqb understood as Profit Amount.

However, if you publish the same model up to the same workspace as before – so you are overwriting the previous version of the model in the Service – and then use the same prompt immediately after publishing:

…Copilot returns an incorrect result: it does not understand what “kqb” means. Why?

After you publish changes to a Power BI semantic model, it can take a few minutes – sometimes up to an hour – for updates to the AI Instructions to be applied. This means that if you’re testing Power BI Copilot in the Service you may need to be patient if you want to see the impact of any changes to AI Instructions, or do your testing in Power BI Desktop.

How can you know whether the latest version of your AI Instructions is being used in the Service when you do your testing? In the Copilot pane, in both Desktop and the Service, there is an option to download diagnostics from the “…” menu in the top right-hand corner. This downloads a text file of diagnostic data in JSON format that contains a lot of useful information; most importantly, it contains the AI Instructions used for the current Copilot session. The file contents aren’t documented anywhere – I guess because the structure could change at any time and it’s primarily intended for use by support – but there’s no reason why you as a developer shouldn’t look at it and use it.

For the second example in the Service above, where Copilot returned the wrong result, here’s what I found at the end of the diagnostics file:

As you can see, the changes I made to the AI Instructions before publishing the second time had not been applied when I ran the prompt asking about kqb.

After waiting a while, and without making any other changes to the model, the same prompt eventually returned the correct results in the Service:

Looking at the diagnostics file for this Copilot session it shows that the new version of the AI Instructions was now being used:

Since looking in the diagnostics file is the only way (at least that I know of right now) to tell which AI Instructions are being used at any given time, it makes sense to do what I’ve done here and put a version number at the top of the instructions so you can easily tell whether your most recent changes are in effect.
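If you adopt this version-number convention, you don’t even need to read the diagnostics file by hand: since the instructions appear verbatim somewhere in the file, you can just search the raw text for the version comment. Here’s a minimal sketch in Python; because the file’s schema is undocumented it deliberately makes no assumptions about the JSON structure, and the marker format is just the `//Instructions V1` convention used in this post:

```python
import re


def find_instruction_versions(path: str) -> list[str]:
    """Scan a downloaded Copilot diagnostics file for //Instructions V<n> markers.

    The diagnostics file is JSON, but since its structure is undocumented
    (and could change at any time) we search the raw text rather than
    assuming any particular schema.
    """
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Match version comments like "//Instructions V2" anywhere in the file
    return re.findall(r"//Instructions V\d+", text)
```

Calling `find_instruction_versions("diagnostics.json")` against the downloaded file would tell you whether the V1 or V2 instructions were in effect for that session. One caveat: if the forward slashes happen to be escaped inside the JSON strings, adjust the pattern to allow for that.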

One last point to mention is that if you’re deploying semantic models using Deployment Pipelines or Git, the docs state that you need to refresh your model after a deployment for changes to AI Instructions to take effect, and that for DirectQuery or Direct Lake (but not Import) mode models this only works once per day.
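If you want to trigger that post-deployment refresh automatically, the standard Power BI REST API refresh endpoint can be called as the last step of your pipeline. Here’s a minimal sketch – it assumes you already have an Azure AD access token with the right permissions, and the workspace and dataset IDs are placeholders you’d supply yourself:

```python
import json
import urllib.request

API_BASE = "https://api.powerbi.com/v1.0/myorg"


def build_refresh_url(workspace_id: str, dataset_id: str) -> str:
    """Build the Power BI REST API URL that queues a refresh for a dataset."""
    return f"{API_BASE}/groups/{workspace_id}/datasets/{dataset_id}/refreshes"


def trigger_refresh(workspace_id: str, dataset_id: str, access_token: str) -> None:
    """POST to the refreshes endpoint; needs a valid Azure AD access token.

    A 202 Accepted response means the refresh has been queued, not that
    it has finished - and the AI Instructions update happens after that.
    """
    req = urllib.request.Request(
        build_refresh_url(workspace_id, dataset_id),
        data=json.dumps({"notifyOption": "NoNotification"}).encode(),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    urllib.request.urlopen(req)
```

Remember the once-per-day limitation mentioned above for DirectQuery and Direct Lake models: calling this endpoint more often won’t make AI Instructions changes apply any faster for those storage modes.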

4 thoughts on “Why Do Power BI Copilot AI Instructions Seem To Work Better in Desktop Than In The Service?”

  1. I have been following the series on Copilot and AI instructions, and I find them very useful. I would like to ask if the functionality tested is available only under Fabric capacity? In our company we still have PBI Premium capacity, and I tried some of your advice and tests on one of our semantic models/reports that we built with all the recommendations for an AI-ready model, including an optimal naming convention, synonym definitions and a revised linguistic schema. While Q&A more or less works as expected, the AI instructions are completely ignored.
    I am wondering if this is because of the Premium capacity limitation, or because the overengineered linguistic schema takes priority over the instructions?
    I noticed that once I deleted one of the fields used in the linguistic schema relationship definitions by mistake (which blocked the linguistic schema due to the errors in its relationships), I found some of my instructions working.
    Then I corrected the missing field and the linguistic schema, and since then all of the AI instructions are completely overlooked, or the answers are most of the time the opposite of the instructions, which is really annoying.
    As an example, the following instruction worked only while the linguistic schema had an error in a relationship due to the removed field: “When you are asked to provide KPI results and if the user does not specify a calendar or fiscal period, do not return any KPI results. Always ask them to specify the period of the query first!” Period, fiscal and calendar are names used in the model…
    The instruction was followed until I corrected the missing relationship field in the linguistic schema. Now this instruction never works. I also have some synonyms in the linguistic schema related to periods (like a “last quarter” synonym etc.) so maybe they might block the related instructions somehow?
    Now I do not know if this is because of a Premium capacity limitation (not fully supporting AI instructions – though if that were true, why could I see them working for a while?) or some broken logic where the linguistic schema overrides the instructions (but then why are all of the instructions ignored?).

    1. AI Instructions work the same on P SKUs and F SKUs; there is no difference. It sounds like something might be corrupted though. If you create a new model with the same data, does it work as expected?

  2. Can you at some point also cover cost control for Copilot? We are having a really hard time identifying the CUs for the Copilot workloads in the Capacity Metrics app. Will we be able to distinguish between Desktop and Service consumption?
