Everyone is excited about Power BI Copilot, and the newly-announced preview of being able to use Copilot to ask questions about all the data in your semantic model rather than just what is shown in a report is a massive step forward. However amazing the LLMs used behind the scenes are, though, the quality of the results your users get from this new Copilot preview depends on a number of factors that you as a developer control. How you model your data is very important and as the announcement blog mentions, so is the linguistic schema that was originally added for Copilot’s predecessor feature, Q&A. Copilot returns much better results than Q&A ever did but the combination of Copilot and the extra information that Q&A’s linguistic schema provides (information Copilot could not know unless you told it) makes Copilot even more powerful. What’s more, you don’t need to edit a YAML file to use this functionality any more because most of the features of the linguistic schema are now available to edit in Power BI Desktop’s Q&A Setup dialog.
In this blog post I’ll show you a few examples of how adding to a model’s linguistic schema improves the new Power BI Copilot preview’s results when you’re querying your semantic model.
Semantic Model
Let’s say you own a farm where you grow fruit. Customers visit the farm to buy fruit and the fruit is picked for these customers by your employees. You store your sales data in a Power BI semantic model that looks like this:
The Orders table is a fact table with one row for each order line. The Order ID column identifies the order that each line item is linked to, the Amount column contains the sales value and the Units column contains the number of fruit delivered for each line. The Employee dimension gives the name of the employee who picked the fruit; the Customer dimension gives the name of the customer who ordered the fruit; the Product dimension gives the name of the fruit picked. Products are associated with Product Groups via a many-to-many relationship.
Here’s the data in the dimension tables:
To get the best results from Copilot note that:
- The data is modelled as a classic star schema.
- The table and column names are in human-readable English with no abbreviations and with spaces between the words. I talked about my opinions on Power BI naming conventions in this blog post.
- All tables and columns that should not be shown in a report have been hidden.
- The fact table measure columns have been hidden and three explicit measures – Order Amount, Order Units and Order Count (which is a distinct count on the Order ID column) – have been created.
Synonyms
While Copilot performs well on this model, let’s look at a simple question where it doesn’t return the expected result:
Show the number of orders by employee
The prompt returns a visual, but on closer inspection it’s not the result you want. It shows the count of rows in the Orders table which is the number of line items, not a count of orders:
To get the correct result you need to tell Copilot that the Order Count measure returns the number of orders by defining a Synonym. You can do this in the Q&A setup dialog in Power BI Desktop on the Synonyms tab:
Setting “number of orders” as a synonym for the Order Count measure means that the prompt now returns the following visual with the results you want:
Verbs
The next prompt to look at is:
Who picked lemons?
You know that on our farm it’s the employees who pick the fruit but there’s nothing in the model to tell Copilot that. As a result the prompt above results in Copilot saying that it doesn’t know what “picked” means in this context:
On the relationships tab of the Q&A Setup dialog you can fix this by defining a Verb relationship:
The relationship tells Copilot that “Employee names” pick “Product names” with the Orders table connecting the two columns.
With this relationship in place, Copilot correctly answers that Gabi was the only employee who picked lemons:
Nouns
The customer Chris is also widely referred to by employees as “Mr Webb”, but that name isn’t stored anywhere in the model. As a result the prompt
How much money did we make from Mr Webb?
results in the following, fairly reasonable, response:
However with a noun relationship set up to tell Copilot that “Mr Webb” is a kind of customer name where customer name equals “Chris”:
Then the result is what you would expect:
Dynamic Nouns
Copilot does a good job with the many-to-many relationship between Product and Product Group without any optimisation. For example the prompt:
show all citrus fruit and their order amounts
Returns the correct result:
But let’s say that in this case you want to show the individual products rather than the product group “citrus fruit”. You can do this by setting up a dynamic noun relationship:
The relationship is that”Product group names” define kinds of “product names” with the Product To Product Group table linking the two. With this in place the prompt now returns the desired result:
Conclusion
These examples barely scratch the surface of what’s possible with the linguistic schema and Copilot. Apart from the documentation, I found the videos on the (fairly old) “Natural Language for Power BI” YouTube channel which were created when Q&A was launched useful for understanding the concepts here too. There’s a lot to learn here but with some trial and error, as well as listening to feedback from your end users, you should be able to tune Copilot so it returns high quality results almost all the time.