Useful Community Tools And Resources For Power BI And Fabric

There are a lot of really cool free, community-developed tools and resources out there for Power BI and Fabric – so many that it’s easy to miss announcements about them. In this post I thought I’d highlight a few that came out recently and which you might want to check out.

Let’s start with the Fabric Toolbox, a collection of tools, samples, scripts and accelerators created and maintained by some of my colleagues here at Microsoft. The most widely-known tool in there is FUAM (Fabric Unified Admin Monitoring), a solution accelerator for monitoring enterprise Power BI and Fabric implementations. It’s the successor to Rui Romano’s Power BI monitoring solution, which is now deprecated, but it’s a lot richer than that. It’s already been the subject of a Guy In A Cube video, so I hope you’ve already come across it. There are other things in the Fabric Toolbox that should be more widely known though. My fellow CAT Phil Seamark (why doesn’t he blog anymore???) has been busy: a month ago he announced a new Power BI load testing tool (video here) based on Fabric notebooks which is much easier to configure than the previous load testing tool created by the CAT team. He’s also published a sample MCP Server that, among other things, can analyse a semantic model to see whether it follows best practices. Another colleague, Justin Martin, has published tools in the toolbox for auditing semantic models and for DAX performance tuning. Finally, with the deprecation of Power BI Datamarts looming, if you choose to replace them with Direct Lake semantic models based on Fabric Warehouse (although I think 90% of the Datamarts I’ve seen can be replaced with simple Import models) then there’s a migration accelerator here.

Elsewhere, if you’re a hardcore Power BI developer you’ll already know how useful TMDL View in Power BI Desktop is. Rui Romano recently announced that there’s a new gallery of TMDL scripts where you can see what’s possible with TMDL and share your own scripts. For example, there’s a script here that creates a date dimension table from a Power Query query.

Two years ago I blogged about a tool called PBI Inspector that provides rules-based best practices testing for the Power BI visualisation layer, created by yet another Microsoft colleague, Nat van Gulck. Not only is there now a V2 of PBI Inspector, which will be renamed Fab Inspector, but two weeks ago Nat announced a VS Code extension that allows you to write, debug and run rules from VS Code.

Last of all Gerhard Brueckl recently announced V2 of Fabric Studio, an incredibly powerful VS Code extension that acts as a wrapper for the Power BI/Fabric REST APIs. It lets you browse your workspaces and their contents from VS Code and create/update/delete items among other things; Gilbert Quevauvilliers recently wrote a nice blog post showing how you can use it to download any Power BI report from the Service easily.

That’s enough for now. If there are other tools or resources that came out recently that I didn’t mention, please leave a comment!

Power BI Copilot, AI Instructions And DAX Query Templates

At the end of my last post I showed how, if you put the definition of a measure in the AI Instructions of your Power BI semantic model, Copilot can use it as a template to create new measures with slightly different logic. That got me wondering – can you do the same thing for DAX queries to solve even more complex problems? It turns out you can.

First of all, why would you want to do this? There are some problems that can be solved relatively easily by writing a DAX query but which are quite hard to solve if you’re limited to writing measures. ABC classification is one example of this. Marco and Alberto have a great article on this problem here; the problem is that when you want to increase the number of buckets/clusters/groups you use, or change the rules for which transactions go into which buckets, and make all this dynamic for end users building reports in Power BI Desktop, your semantic model can get very complicated very quickly. In a DAX query you can solve these problems much more easily, in part because you can define a calculated column at query time and use it to do your grouping. And since Copilot can generate and run DAX queries, it can be taught how to do ABC analysis by giving it an example DAX query and some instructions telling it how the query works. Let’s see an example.

The sample semantic model I’ve been using in this series has a fact table called Transactions where each row represents a real estate property sale in England and Wales. Let’s say you want to group these rows into different groups based on the price paid for the property. With nothing in the AI Instructions for the model, the following prompt:

Do an ABC analysis on the Transactions table, grouping all transactions into three groups named A, B and C based on the Price column where the upper boundary for the first group is £250000 and the upper boundary for the second group is £700000. Filter the transactions to just 1st January 2025.

…will sometimes generate the right result but will also sometimes generate a slightly strange or even incorrect result. Here’s an example where the information in the summary is correct but the DAX query is far from ideal:

EVALUATE
  // Create a table with transactions on 1st January 2025 and assign an ABC group based on Price
  SELECTCOLUMNS(
    FILTER(
      'Transactions',
      'Transactions'[Date] = DATE(2025, 1, 1)  // Filter transactions to 1st January 2025
    ),
    "Price", 'Transactions'[Price],
    "ABC Group", 
      SWITCH(
        TRUE(),
        'Transactions'[Price] <= 250000, "A",      // Group A: Price up to £250,000
        'Transactions'[Price] <= 700000, "B",      // Group B: Price over £250,000 and up to £700,000
        "C"                                       // Group C: Price above £700,000
      )
  )
ORDER BY
  [ABC Group] ASC,
  [Price] ASC

This query returns one row per transaction rather than one row per group:

Not something you can give to an end user for sure. Here’s a DAX query that gives me exactly the result I want:

DEFINE
    //Create a new calculated column at query time
    //to create the groups for ABC classification
    COLUMN 'Transactions'[Group] =
        //Upper boundary for price for group A
        VAR AUpperBoundary = 250000
        //Upper boundary for price for group B
        VAR BUpperBoundary = 700000
        RETURN
            //Return a different letter representing a group name
            //based on where the value in the Price column sits in
            //the boundaries defined
            SWITCH(
                TRUE(),
                //If the price is less than or equal to the variable AUpperBoundary
                //then return the value "A"
                'Transactions'[Price] <= AUpperBoundary, "A (<=£250,000)",
                //If the price is less than or equal to the variable BUpperBoundary
                //then return the value "B"
                'Transactions'[Price] <= BUpperBoundary, "B (>£250,000 and <=£700,000)",
                //Otherwise return the value "C"
                "C (>£700,000)"
            )

//Returns the results of the classification
EVALUATE
SUMMARIZECOLUMNS(
    'Transactions'[Group],
    //Filter by a given date
    KEEPFILTERS( TREATAS( {DATE(2025,1,1)}, 'Date'[Date] ) ),
    "Count Of Transactions", [Count Of Transactions]
)
ORDER BY
    'Transactions'[Group] ASC

Here’s what this query returns in DAX Studio:

Putting this query in the semantic model’s AI Instructions with some explanatory text, like so:

The following DAX query does an ABC analysis on the Transactions table, grouping transactions into three groups called A, B and C, for the 1st January 2025.  The transactions with the lowest prices always go into group A, then subsequent letters represent higher price ranges. If the user asks for an ABC analysis use this query as a template.

DEFINE
    //Create a new calculated column at query time
    //to create the groups for ABC classification
    COLUMN 'Transactions'[Group] =
        //Upper boundary for price for group A
        VAR AUpperBoundary = 250000
        //Upper boundary for price for group B
        VAR BUpperBoundary = 700000
        RETURN
            //Return a different letter representing a group name
            //based on where the value in the Price column sits in
            //the boundaries defined
            SWITCH(
                TRUE(),
                //If the price is less than or equal to the variable AUpperBoundary
                //then return the value "A"
                'Transactions'[Price] <= AUpperBoundary, "A (<=£250,000)",
                //If the price is less than or equal to the variable BUpperBoundary
                //then return the value "B"
                'Transactions'[Price] <= BUpperBoundary, "B (>£250,000 and <=£700,000)",
                //Otherwise return the value "C"
                "C (>£700,000)"
            )

//Returns the results of the classification
EVALUATE
SUMMARIZECOLUMNS(
    'Transactions'[Group],
    //Filter by a given date
    KEEPFILTERS( TREATAS( {DATE(2025,1,1)}, 'Date'[Date] ) ),
    "Count Of Transactions", [Count Of Transactions]
)
ORDER BY
    'Transactions'[Group] ASC

…means that the prompt above not only returns what you would expect:

…but more importantly, variations on the prompt such as:

Do an ABC analysis on the Transactions table, grouping all transactions into four groups named A, B, C and D based on the Price column where the upper boundary for the first group is £300000, the upper boundary for the second group is £750000 and the upper boundary for the third group is £900000. Filter the transactions to just 16th January 2025.

…take the template query and modify it as you would expect:

Note that the prompt has not only changed the date to filter by but increased the number of groups from three to four, and that Copilot has modified the DAX query to handle this – which I think is very impressive. Here’s the DAX query Copilot generated:

// DAX query generated by Fabric Copilot with "Group all transactions from the Transactions table into four groups named A, B, C, and D based on the Price column, where group A includes transactions with Price up to £300,000, group B includes transactions with Price above £300,000 and up to £750,000, group C includes transactions with Price above £750,000 and up to £900,000, and group D includes transactions with Price above £900,000. Filter the transactions to just 16th January 2025. Show the count of transactions in each group."
DEFINE
// Create a new calculated column at query time 
// to classify transactions into groups A, B, C, and D based on the Price column
COLUMN 'Transactions'[Group] =
    // Define the upper boundaries for each group
    VAR AUpperBoundary = 300000
    VAR BUpperBoundary = 750000
    VAR CUpperBoundary = 900000
    RETURN
        SWITCH(
            TRUE(),
            'Transactions'[Price] <= AUpperBoundary, "A (<=£300,000)",
            'Transactions'[Price] <= BUpperBoundary, "B (>£300,000 and <=£750,000)",
            'Transactions'[Price] <= CUpperBoundary, "C (>£750,000 and <=£900,000)",
            "D (>£900,000)"
        )
        
EVALUATE
    SUMMARIZECOLUMNS(
        'Transactions'[Group],
        // Filter transactions to include only the 16th January 2025
        KEEPFILTERS( TREATAS( { DATE(2025,1,16) }, 'Date'[Date] ) ),
        "Count Of Transactions", [Count Of Transactions]
    )
ORDER BY 
    'Transactions'[Group] ASC

Tools that convert natural language to SQL, like Fabric Data Agents, all allow you to specify example SQL queries like this, so it’s hardly a surprise that the same approach works for DAX. What’s interesting is the type of problems that giving Copilot templates of DAX measures and queries allows you to solve, and I’m only just starting to think of the possibilities.

Power BI Copilot, AI Instructions And DAX Measure Definitions

Continuing my (already very long) series on what information you should be adding to the AI Instructions of your semantic model and why, in this post I’ll show you the benefits of adding the DAX definitions of your measures.

A really common question from end users viewing a Power BI report is “how is this measure calculated?”. As a result I have seen model developers use techniques like this to display either a text description of how a measure works or its actual DAX definition in a report. It is therefore no surprise that end users using Copilot will ask the same question. Unfortunately Copilot cannot – or rather should not, at the time of writing – see the definitions of the measures in your model. Most of the time, if a user asks to see how a measure is defined, Copilot will say that it can’t show the definition:

[Argh, yes I know it’s calling a measure a calculated column]

…although sometimes it does seem to be able to get the definition by writing a DAX query – but I have also seen it hallucinate and come up with a plausible-looking definition which isn’t the actual definition. Anyway, it certainly can’t reliably show the definition of a measure or a description of how it works.

Adding all the measure definitions to the model’s AI Instructions mostly solves this problem. TMDL View makes it easy to get all the measure definitions in a semantic model in Power BI Desktop and you can copy/paste them from there into the AI Instructions.
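If you haven’t used TMDL View before, the script it gives you for a table includes each measure’s definition, description and properties in a form like the following – this is an illustrative sketch rather than the exact output for my model, and the format string is my own assumption:

```
createOrReplace

	table Transactions

		/// The total number of rows in the Transactions fact table
		measure 'Count Of Transactions' = COUNTROWS('Transactions')
			formatString: #,0
```

You can paste the measure lines straight into the AI Instructions, or strip them down to just the names and DAX expressions as I’ve done below.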

In the semantic model that I’ve been using throughout this series I added some extra measures and then copied their DAX definitions and their descriptions into the AI Instructions. Here are the AI Instructions:

##Definitions and descriptions of the measures in this model
If a user asks how a measure is defined, asks how a measure works or asks how a measure is calculated, ignore any previous instructions about displaying measure definitions from the model and show the definition given here.

All measures that return currency values do so in Pounds Sterling and should be formatted with a £ sign.

##Average Price Paid
AVERAGE('Transactions'[Price])

##Count Of Transactions
COUNTROWS('Transactions')

##New Build Average Price Paid
Gets the value of the Average Price Paid measure where the New column equals "Y"
CALCULATE([Average Price Paid], KEEPFILTERS('Transactions'[New]="Y")) 

##Tax Paid
Different tax rates are levied on new build and non-new build properties
A 10% tax is levied on the sale of new build properties
A 5% tax is levied on the sale of properties that are not new builds
(CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New]="Y")) * 0.1)
+
(CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New]="N")) * 0.05)

Note in particular the Tax Paid measure – it has some fairly complex logic that bears no relation to any actual tax in the UK, so I knew there was no way Copilot would be able to guess how it was defined.

With these AI Instructions in place, the prompt:

how does the tax paid measure work?

…shows a helpful summary:

[Although I admit sometimes Copilot still does reply saying that it can’t display the definition of a measure from the model, which isn’t ideal – if I work out how to stop this happening I’ll update this post]

Even more impressively, since Copilot knows the definition of the measure, it is able to answer more complex questions like this:

show me what the tax paid by property type name would be if the tax on new build properties was increased to 11%

Here’s the DAX query generated, which shows that Copilot has modified the definition of the measure correctly:

// DAX query generated by Fabric Copilot with "Show the tax paid by Property Type Name if the tax on new build properties was increased to 11%."
EVALUATE
  // Summarize the tax paid by Property Type Name with increased new build tax rate (11%)
  SUMMARIZECOLUMNS(
    'PropertyTypes'[Property Type Name],
    "Tax Paid", 
      // New build tax (rate increased to 11%) plus non-new build tax (rate unchanged at 5%)
      (CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New] = "Y")) * 0.11) +
      (CALCULATE(SUM('Transactions'[Price]), KEEPFILTERS('Transactions'[New] = "N")) * 0.05)
  )
  ORDER BY
    'PropertyTypes'[Property Type Name] ASC

Why Do Power BI Copilot AI Instructions Seem To Work Better in Desktop Than In The Service?

I’m spending a lot of time with organisations testing Power BI Copilot at the moment, and something I hear a lot is that Copilot works well in Desktop but when you publish your model to the Power BI Service the results are a lot more inconsistent. One reason for this is the way updates to AI Instructions are applied after you publish your semantic model.

Let’s see an example of this. Consider the following semantic model consisting of a single table with two measures, Sales Amount and Profit Amount:

The semantic model has the following AI Instructions applied:

//Instructions V1
xyz means the measure Sales Amount

The instructions here don’t make much sense, but using a meaningless term like “xyz” makes it easier to test whether Copilot is using an instruction or not.

In Power BI Desktop, the following Copilot prompt returns the results you’d expect with xyz understood as Sales Amount:

show xyz

If you publish this model to an empty workspace in the Power BI Service then this prompt returns the same correct result.

[By the way, the message “Copilot is currently syncing with the data model. Results may be inconsistent until the sync is finished” will be the subject of a future blog post – it’s not connected to what I’m describing in this post, it relates to how Copilot needs to index the text values in your semantic model, which is a separate process]

So far so good. Going back to Power BI Desktop, changing the AI Instructions like so:

//Instructions V2
xyz means the measure Sales Amount
kqb means the measure Profit Amount

…then closing and reopening the Copilot pane in Desktop and entering the prompt:

show kqb

…also returns the result you would expect, with kqb understood as Profit Amount

However, if you publish the same model up to the same workspace as before – so you are overwriting the previous version of the model in the Service – and then use the same prompt immediately after publishing:

…Copilot returns an incorrect result: it does not understand what “kqb” means. Why?

After you publish changes to a Power BI semantic model it can take a few minutes, sometimes up to an hour, for updates to the AI Instructions to be applied. This means if you’re testing Power BI Copilot in the Service you may need to be patient if you want to see the impact of any changes to AI Instructions, or do your testing in Power BI Desktop.

How can you know whether the latest version of your AI Instructions is being used in the Service when you do your testing? In the Copilot pane in both Desktop and the Service there is an option to download diagnostics from the “…” menu in the top right-hand corner. This downloads a text file with diagnostic data in JSON format which contains a lot of useful information; most importantly it contains the AI Instructions used for the current Copilot session. The file contents aren’t documented anywhere, I guess because the structure could change at any time and it’s primarily intended for use by support, but there’s no reason why you as a developer shouldn’t look at it and use it.

For the second example in the Service above, where Copilot returned the wrong result, here’s what I found at the end of the diagnostics file:

As you can see the changes I made to the AI Instructions before publishing the second time had not been applied when I ran the prompt asking about kqb.

After waiting a while, and without making any other changes to the model, the same prompt eventually returned the correct results in the Service:

Looking at the diagnostics file for this Copilot session it shows that the new version of the AI Instructions was now being used:

Since looking in the diagnostics file is the only way (at least that I know of right now) to tell what AI Instructions are being used at any given time, it makes sense to do what I’ve done here and put a version number at the top of the instructions so you can tell easily whether your most recent changes are in effect.

One last point to mention is that if you’re deploying semantic models using Deployment Pipelines or Git, the docs state that you need to refresh your model after a deployment for changes to AI Instructions to take effect and that for DirectQuery or Direct Lake (but not Import) mode models this only works once per day.

Power BI Copilot, AI Instructions And Visualisation Guidelines

If there are specific ways you want your data to be visualised by Power BI Copilot then you have two options. You can use Verified Answers to link questions to visuals on your reports and can even set them up so users can apply filters to these visuals in their prompts. Alternatively – and this is the topic of this post – you can use AI Instructions to tell Copilot which visuals should be used when certain combinations of measures and columns appear together in a prompt.

Using the same semantic model I have used in this series of posts, consider the following prompts and the responses returned when there are no AI Instructions:

show count of transactions by date for the first week of January 2025

This returns results as a line chart:

show count of transactions broken down by New

[There is a column in the model called New containing Y or N values]

This returns results as a bar chart:

Adding the following AI Instructions changes the visualisations used by Copilot for these prompts:

When showing the count of transactions measure by date always use a column chart.
When showing the count of transactions measure by New use a pie chart.

Here are the responses returned now:

show count of transactions by date for the first week of January 2025
show count of transactions broken down by New

Before someone comments, I’m not saying that these visualisation choices are better than the defaults, I’m just saying that you can use AI Instructions to change the visuals used in responses. There may be a type of analysis for your data or your industry that should always use a certain Power BI visual, one that isn’t the visual that Copilot chooses by default. And we all know that some users have strong opinions on which visuals should be used that don’t follow data visualisation best practices…

Power BI Copilot, AI Instructions And Semantic Model Relationships

Power BI Copilot knows about the relationship between measures and tables, which means that it’s very good at knowing which measures can be broken down by which columns. For example, consider the following model with two fact tables, one of which contains sales data dimensioned by Employee and Customer, and one which contains target data that is only dimensioned by Employee:

For extra fun, I’ve created a disconnected measure table to hold all the measures, which are defined as follows:

Sales Amount = SUM(Sales[Sales])
Target Amount = SUM(Targets[Target])
Target % = DIVIDE([Sales Amount],[Target Amount])

Without any AI Instructions added to the model, for the following prompt:

Show Sales Amount broken down by Customer

I get (as you would expect) the following correct response:

However for the prompt:

Show Target Amount broken down by Customer

…I get a response saying that Target Amount can’t be broken down by Customer because there’s no direct relationship. This is, I think, the right response for most users.

Note that Copilot knows the relationships between the Sales Amount and Target Amount measures and the Customer dimension table even though they are on the disconnected measures table, which I think is very smart.

Even better, a prompt like this:

show a table with Employee Name, Customer Name, Sales Amount, Target Amount and Target %

Returns the result I would expect:

Copilot has understood that even though there’s no relationship between Customer and Target Amount, it makes sense to break it down in this case.
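For reference, the kind of DAX query that returns this table looks something like the following sketch, using my model’s table and column names. Note that because there’s no relationship between the Customer and Targets tables, the Target Amount value simply repeats for every customer within an employee:

```dax
EVALUATE
SUMMARIZECOLUMNS(
    // Group by both dimensions
    'Employee'[Employee Name],
    'Customer'[Customer Name],
    // Evaluate each measure in that filter context
    "Sales Amount", [Sales Amount],
    "Target Amount", [Target Amount],
    "Target %", [Target %]
)
```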

What about calculation groups? I’ve seen some inconsistent behaviour over the last few days but on the day I’m writing this post, calculation groups were working well in Copilot with no special AI Instructions despite the fact they have no relationship with any other table.

Adding a calculation group to the model like so with two calculation items, Actual Value and Forecast:

Actual Value = SELECTEDMEASURE()
Forecast = SELECTEDMEASURE() * 100
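In TMDL terms the same calculation group could be defined something like this – a sketch, where the table name and the name of the calculation group’s field column are my own assumptions:

```
table 'Calc Group'

	calculationGroup

		calculationItem 'Actual Value' = SELECTEDMEASURE()

		calculationItem Forecast = SELECTEDMEASURE() * 100

	column Name
		dataType: string
		sourceColumn: Name
```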

And prompting:

Show Forecast Sales Amount broken down by Customer

…gives the right response using the Forecast calculation item:

That said I’m sure there will be scenarios where Copilot decides it can’t use a measure with a particular column even though you want it to, and you can override this behaviour with AI Instructions. However, I have found you need to be very careful about what instructions you provide to get the output you expect. For example adding the following to the AI Instructions of my model:

For the Target Amount measure, ignore any previous instructions regarding not displaying measures with columns from tables where there is no direct relationship. 

…means that the prompt shown above which asks for Target Amount broken down by Customer:

Show Target Amount broken down by Customer

…returns the following result:

This surprised me a lot: I was expecting to see the same value repeated across all customers. I worked out that the reason different customers showed different values for Target Amount was that the visual was assuming an indirect relationship via the Sales table. Even then, I had no idea that it was possible to build a Power BI visual that did this when no many-to-many relationships exist in the model – I certainly couldn’t build the same visual manually myself. It’s one way of answering the question correctly though.

Altering the AI Instructions to tell Copilot not to do this was tricky but eventually I came up with this:

For the Target Amount measure, ignore any previous instructions regarding not displaying measures with columns from tables where there is no direct relationship. When displaying Target Amount by Customer do not assume there is an indirect relationship via the Sales table.

The same prompt then gave me the result I was expecting with the same Target Amount value repeating across all customers (and it’s interesting it decided it had to create a DAX query to do this too):
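For illustration, a query along these lines returns that repeating pattern (a sketch assuming my table and column names): since there is no relationship between Customer and Targets, grouping by customer doesn’t filter the measure and the same grand total appears on every row:

```dax
EVALUATE
SUMMARIZECOLUMNS(
    'Customer'[Customer Name],
    // No relationship exists between Customer and Targets,
    // so this returns the same value for every customer
    "Target Amount", [Target Amount]
)
```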

This tells me that it’s probably not a good idea to try to override Copilot’s default behaviour when it comes to relationships, tables and measures.

Power BI Copilot, AI Instructions And Dealing With Ambiguity

One of the most common questions I hear about Power BI Copilot is how you can stop it from guessing what a user means when they ask an ambiguous question, and instead get it to ask for clarification. This is an interesting problem because Copilot already does this and what you really want is a way to control the level of tolerance for ambiguity. What’s more, if Copilot guesses what the user means correctly you’re probably not going to complain or even notice; it’s only when it guesses incorrectly that you’re going to wish it had asked what the user meant.

Using the semantic model containing UK real estate sales data that I’ve used throughout this series of posts, with no AI Instructions added to the model, consider the following prompt:

where is the nicest place to live in England?

It’s a great example of a question that could potentially be answered from the data but only with more information on what the user means by “nicest”. I think Copilot comes back with a very good response here:

As you can see, Copilot doesn’t know what the user means by “nicest” and asks the user what criteria they want to use to determine whether a place is “nice”.

What about an example of where Copilot does make assumptions about what the user means? Take the following prompt:

show how the number of sales varied recently

This time it does come back with an answer. I think this is a good, useful response but I want you to notice that Copilot has made two assumptions here:

  1. It has interpreted “number of sales” as meaning the measure Count of Transactions
  2. It has taken the term “recently” and decided to show date-level data for the latest available month of data, April

Can this behaviour be changed? Yes, but it’s not an exact science. For example adding the following to the AI Instructions of the model:

If you don't understand what the user is asking for never, ever guess - you must always ask for clarification

Here’s what Copilot now responds to the second prompt above:

As you can see, it’s now asking what is meant by “recently”. Clarifying this as follows:

"recently" means the days in April

Gives the same result as without the AI Instructions:

BUT – even though Copilot asked what “recently” meant, it still went ahead and assumed that “number of sales” meant the Count of Transactions measure. Adding to the AI Instructions to make it clear that Copilot should always ask if there’s any doubt about which measure to use, like so:

If you don't understand what the user is asking for never, ever guess - you must always ask for clarification. In particular if the user does not explicitly mention the exact name of the measure they want to use, or the name of the measure is in any way ambiguous, do not return a result. Instead ask the user which measure they want to use. Be extremely cautious about which measure you use.

…results in a response that asks for clarification not only about what “recently” means but also what “number of sales” means:

Copilot doesn’t do this reliably though: even though it now always seems to ask about “recently”, it only sometimes asks for clarification about “number of sales”. Sometimes is better than never though; indeed it’s the kind of uncertainty that is expected with Generative AI. I think I need to do some more research into what’s going on here. At least this shows that AI Instructions can be used to make Copilot more cautious around ambiguous questions and more likely to ask for clarification.

[Update 17th July 2025]

After testing this some more (and possibly after an update to the Power BI Service in the last few days) I have come up with some AI Instructions that seem to be a lot more reliable when it comes to asking the user which measure they want to use and to define what “recently” means:

If you don't understand what the user is asking for never, ever guess - you must always ask for clarification. If there are multiple points you don't understand it is essential that you ask the user to clarify all of them.

In particular, ignore all previous instructions regarding which measure to select and make sure you obey the following rule: if the user does not explicitly mention the exact name of the  measure they want to use, or the name of the measure is in any way ambiguous, do not return a result. Instead ask the user which measure they want to use. Be extremely cautious about which measure you use! At the same time, as I said, remember to clarify other, non-measure related ambiguities.

Power BI Copilot, AI Instructions And Preventing The Use Of Implicit Measures

In yet another entry in my series on what you should be doing in Power BI Copilot AI Instructions, in this post I want to address the most difficult (in terms of deciding what to do, rather than how to do it) topic: whether you should allow the creation of implicit measures.

A quick recap of terminology: implicit measures are measures that are automatically created when you drag and drop a field into a visual and tell Power BI how the data should be aggregated (for example by summing, counting etc); explicit measures are measures that are specifically defined by a semantic model developer, have a name and have a DAX expression that specifies what data should be aggregated and how it should be aggregated. There is a property in a semantic model called “discourage implicit measures” that prevents report designers from creating implicit measures in the UI but it is by no means foolproof and at the time of writing Power BI Copilot does not respect it. This property was created for use with calculation groups but I’m going to leave the subject of Copilot and calculation groups for a future post.

Let’s see an example of an implicit measure created by Copilot. In the model I’ve been using in this series I have only two explicit measures, defined as follows:

Average Price Paid = AVERAGE('Transactions'[Price])
Count Of Transactions = COUNTROWS('Transactions')

The following prompt:

Show the number of distinct postcodes by month

Returns the following correct result by creating an implicit measure that does a distinct count on the Postcode column:
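Conceptually, the implicit measure Copilot creates here is just a distinct count of the Postcode column grouped by month. A minimal Python sketch of that calculation, using a few made-up sample rows rather than the real Land Registry data:

```python
from collections import defaultdict

# Hypothetical sample rows: (month, postcode) pairs standing in for the Transactions table
transactions = [
    ("January", "YO8 9XG"),
    ("January", "E4 8QJ"),
    ("January", "YO8 9XG"),  # duplicate postcode: only counted once per month
    ("February", "E4 8QJ"),
]

# Collect the distinct postcodes seen in each month, then count them
postcodes_by_month = defaultdict(set)
for month, postcode in transactions:
    postcodes_by_month[month].add(postcode)

distinct_counts = {month: len(codes) for month, codes in postcodes_by_month.items()}
print(distinct_counts)  # {'January': 2, 'February': 1}
```

The DAX equivalent would be an explicit measure using DISTINCTCOUNT over the Postcode column, which is exactly the kind of explicit measure you would need to define yourself if you chose to block implicit measures as described below.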

This is great – so why would you ever want to prevent this from happening and stop the use of implicit measures?

First of all, there’s always a danger with implicit measures that an end user – whether they are using Copilot or not – will try to aggregate data in a way it should not be aggregated and end up with incorrect or misleading results. For example, the prompt:

Show the sum of month number by county

Returns a result, but not one that makes sense because you should never sum up values in the month number column:

[And this is despite the Summarization property of the Month Number column being set to “Don’t summarize”, which I need to talk to someone about]

Second, in complex semantic models, it may be necessary to include business logic in every explicit measure to ensure they always return the results you want. Implicit measures will not contain this logic and will therefore not return correct results.

As a result, if you’re sure you have created all the explicit measures an end user could ever need (and that’s definitely an achievable goal for many semantic models) then preventing the use of implicit measures could be a good thing.

Adding the following text to the model’s AI Instructions to prevent the use of implicit measures:

Never aggregate data using implicit measures. Only ever use the explicit measures defined in the semantic model to show aggregated data. If the user asks to aggregate data from a column, tell them that you aren't allowed to do this because of the risk of showing incorrect data.

Means that the following prompt from above:

Show the number of distinct postcodes by month

Now returns a message saying that Copilot isn’t allowed to aggregate data by itself:

As you can see, it is possible to use AI Instructions to prevent the use of implicit measures but this is something you really need to think hard about: it may help stop users getting incorrect results but it could also stop users from getting the correct results they need in some cases. It’s the old struggle between centralised, “the developer knows best” BI and decentralised self-service BI all over again.

Data Validation In Power BI Copilot AI Instructions

Here’s yet another post in my series on things I think you should be doing in Power BI Copilot AI Instructions. Today: validating values that users enter as filters in their prompts. It’s something of a companion piece to last week’s post about helping users understand what data is and isn’t in the semantic model, because the more I think about it, the more convinced I am that the biggest problem users have when trying to query data in natural language is knowing what data is there to be queried. As I said last week, if you’re an end user interacting with a Power BI report via a slicer or a filter you know what values are available to choose because you can see them listed in the slicer or the filter – but you don’t see them when composing a prompt. As a developer or someone doing a demo it’s easy to forget this because you know the data so well, but for an end user it’s not so easy and so they need all the help that the model developer can give them.

Let’s see an example using the semantic model that I’ve been using in this series containing UK real estate sales data. The Transactions table in my semantic model contains one row for each property sold; each property’s address is given and each address has a UK postcode (something like a US zip code – I’m sure all countries have an equivalent).

Everyone in the UK knows their postcode and a postcode contains a wealth of geographic information, as this section of the Wikipedia article on postcodes shows. There’s no need to get too detailed on their format though because I want to point out one important feature of all of the properly-formatted postcodes in the Transactions table shown above: they all have a space in the middle of them. And people being people, when they use postcodes, they usually forget that and write a postcode without the space.
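As an aside, because the inward part of a UK postcode (the part after the space) is always three characters long, a missing space can be repaired deterministically if you control the data or the client. A quick Python sketch of that normalisation – purely illustrative, and not something Copilot does for you:

```python
def normalise_postcode(raw: str) -> str:
    """Uppercase a UK postcode and insert the space before the three-character inward code."""
    compact = raw.replace(" ", "").upper()
    if len(compact) < 5:  # too short to be a full postcode, return as entered
        return raw.upper()
    # The inward code is always the last three characters, e.g. "YO89XG" -> "YO8 9XG"
    return compact[:-3] + " " + compact[-3:]

print(normalise_postcode("yo89xg"))   # YO8 9XG
print(normalise_postcode("YO8 9XG"))  # YO8 9XG
```

Of course you can’t run code like this inside a Copilot prompt, which is why the rest of this post relies on AI Instructions instead.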

This has consequences for Power BI Copilot. For example, the prompt:

show count of transactions for the postcode YO89XG

Returns a message saying that Copilot can’t find any data for the postcode “YO89XG”. This is because the postcode doesn’t contain a space. This is what you might expect as a developer but it will not make much sense to an end user.

On the other hand if the postcode in the prompt does contain a space in the right place, like so:

show count of transactions for the postcode YO8 9XG

…it returns the desired result:

How can we address this specific issue? Fairly easily, it turns out, because UK postcode formats are well documented and I would imagine Copilot has been trained on the same Wikipedia page on postcodes that I linked to above. As a result, adding the following to the AI Instructions for my semantic model:

The postcode column contains postcodes for locations in England and Wales. If the user enters a value to filter by for postcode that returns no data and the value is in an invalid format for a postcode, tell them they appear to have made a mistake, explain why the postcode format is wrong and suggest some changes to the value entered by the user that might result in a valid postcode.

Means that when I use the first prompt above, for the postcode without the space, I get a much more helpful response:

Clicking on the first option in this screenshot alters the prompt to include a space in the right place, which results in the user seeing the desired data:

I was encouraged by this, but there’s one obvious problem here: this only works for data like UK postcodes where the format is widely known. The format of your company’s invoice numbers is unlikely to be something that Copilot knows about.

So I experimented with using regular expressions in my AI Instructions and guess what, they seemed to work really well! But then I stopped to think – could I really trust an LLM to use a regex to validate values? The good thing about working at Microsoft is that I have a bunch of friendly colleagues who know way more about AI than I do so I asked them this question. One of them told me that for Copilot to properly validate data using regexes it would need to write some code and it can’t do that yet; instead it’s probably interpreting what the regex is looking for and trying to match the value against the interpretation. So while it might appear to work it would be prone to making errors.

Damn. That meant that if the LLM made a mistake when validating the data before running the query, it would risk preventing the user from filtering by a valid postcode, which would not be good. But then I thought, what if I applied the validation after it was clear that the user had entered a postcode that returned no data? That way it would be less important if the LLM made a mistake in its check, because it would only happen when it was clear the user needed extra help.

Writing the AI Instruction to only validate the data after checking to see if the value the end user was filtering on didn’t exist seemed to work. Here’s the AI Instruction using a regex I found here to validate UK postcodes:

The postcode column contains postcodes for locations in England and Wales. UK postcodes must follow the format described in the following regular expression:
^([A-Za-z]{2}[\d]{1,2}[A-Za-z]?)[\s]+([\d][A-Za-z]{2})$
If the user enters a value to filter by for postcode that returns no data and the value is in an invalid format for a postcode, tell them they appear to have made a mistake, explain why the postcode format is wrong and suggest some changes to the value entered by the user that might result in a valid postcode.

Note how I say “If the user enters a value to filter by for postcode that returns no data…”
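As a sanity check on the regex itself (as opposed to Copilot’s interpretation of it), it’s worth running the pattern through a real regex engine. A quick Python sketch using the same pattern quoted above:

```python
import re

# The same pattern quoted in the AI Instruction above
POSTCODE_PATTERN = re.compile(r"^([A-Za-z]{2}[\d]{1,2}[A-Za-z]?)[\s]+([\d][A-Za-z]{2})$")

for value in ["YO8 9XG", "YO89XG", "E48QJJ"]:
    matched = bool(POSTCODE_PATTERN.match(value))
    print(f"{value!r}: {'valid' if matched else 'invalid'}")
# 'YO8 9XG': valid
# 'YO89XG': invalid
# 'E48QJJ': invalid
```

One thing a real engine reveals is that this particular pattern requires a two-letter area code, so genuine postcodes with a single-letter area such as “M1 1AE” would be rejected – one more reason to only apply the validation after a query has already returned no data.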

Here’s the result for the following prompt asking for data for an invalid postcode:

show count of transactions for the postcode E48QJJ

I did some other tests on sample data and they do indeed suggest that the wording you use in the AI Instruction can control whether Copilot tries to validate the data before checking if the value the user is filtering on exists (which, as I said, would be bad because of the risk of it making a mistake when trying to validate data) or after (which, as I said, is a lot less dangerous).

All in all it seems that putting some thought into data validation in AI Instructions can result in a much friendlier end user experience in Copilot. That said, I doubt that however good your AI Instructions the experience will ever match the experience of seeing a list of possible values in a filter or slicer. Maybe what we need is something like IntelliSense when writing a prompt so you can see and search for values in your data?

Power BI Copilot AI Instructions: Helping Users Understand The Scope Of The Data

Continuing my series of posts on Power BI Copilot and the type of things you should be including in AI Instructions, today I want to talk about helping the end user understand the scope of the data: what data your model does contain, what data it could contain but doesn’t, and what data it could never contain.

Let me show you some examples to explain what I mean, using the semantic model I introduced in the first post in this series containing real estate transaction data from the Land Registry in the UK. On a model with no AI Instructions added, if I give Power BI Copilot the prompt:

show average price paid for the city manchester

I get the correct answer back:

That’s good and as you would expect. Now consider the following two prompts which both return no data:

show average price paid for the city edinburgh
show average price paid for the locality Boode

A user asking a question and not getting an answer is unavoidable in many cases but it still represents a failure: the user wanted information and the model couldn’t help. As a semantic model developer you need to minimise the impact of this and help the user understand why they didn’t get an answer. In both cases Copilot does the right thing and tells the user that it can’t find data for these places. However, there’s an important difference between these two places.

Boode is a very small village (somewhere I found at random) in England. There’s no data for it in the semantic model because there happen to be no real estate sales for it. Edinburgh on the other hand is a large city, so why is there no data? It’s because this semantic model only contains data for England and Wales, not Scotland, Northern Ireland or anywhere else. An end user might expect that a dataset from a UK government agency would contain data for the whole of the UK but this one doesn’t. If the user was looking at a report displaying this data then this information would probably be displayed somewhere in a textbox on an information page, or be obvious from the values visible in slicers or filters, but with Copilot we have to tell the user this in other ways.

Similarly, consider the following two prompts:

show average price paid for January 2025
show average price paid for January 2024

Why is a blank displayed for January 2024? Were there no sales in January 2024? The answer is that the semantic model only contains data between January 1st 2025 and April 30th 2025, so there’s no point asking for data before or after that – and we need to tell the user so.

Here is some text that I added to the AI Instructions for this semantic model to tell Copilot what data is and isn’t present:

##What data this semantic model contains
This semantic model only contains data for England and Wales. If the user asks for data outside England and Wales, tell them that this model does not contain the data they are looking for.
This model only contains data for the date range 1st January 2025 to 30th April 2025. Tell the user this if they ask for data outside this date range.

With these instructions in place let’s rerun some of the prompts above. First, let’s ask about Edinburgh again:

This is a much more helpful response for the user I think. Copilot knows that Edinburgh is in Scotland and it tells the user that it only has data for England and Wales. Copilot doesn’t know every place name in Scotland but it does pretty well; for example (to take another random place name) it knows that Arrina is in Scotland:

Similarly, here’s what Copilot now says for January 2024:

Again, a much better response than before: it explicitly tells the user that the model only contains data for January 1st 2025 to April 30th 2025.

Giving Power BI Copilot clear instructions on what data the semantic model does and doesn’t contain means that it can set end users’ expectations correctly. This then means that users are more likely to ask questions that give them useful information, and therefore builds trust. Conversely, not explaining to end users why their questions are returning no data means they are less likely to want to use Copilot in the future.