21st Blog Birthday: Centralised And Decentralised BI And AI

As the saying goes, history doesn’t repeat itself but it rhymes. While 2025 has seen the excitement around new advances in AI continue to grow in data and analytics as much as anywhere else, it’s also seen the re-emergence of old debates. In particular, one question has raised its head yet again: should there be a single, central place for your data to live and for your security, semantic models, metrics and reports to be defined, or should you take a more distributed approach and delegate some of the responsibility for managing your data and defining and maintaining those models, metrics and reports to the wider business?

At first the answer seems obvious: centralisation is best. If you want a single version of the truth then all your data, all your semantic models and all your metrics should be centralised. Anything else leads to inconsistency between reports, inefficiencies, security threats and compliance issues. But while this is a noble ideal, and one that is very appealing to central data teams building an empire, I think history has already proved that this approach doesn’t really work. If it did, MicroStrategy and Business Objects would have solved enterprise BI twenty years ago and all companies would have a long-established, lovingly curated central semantic model, sitting on an equally long-established, lovingly curated central data warehouse, that all business users love to use. That’s not the case though, and there’s a reason why the self-service revolution of Tableau, Qlik and ultimately Power BI happened: old-style centralised BI owned by a centralised data team solved many problems (notably the problems of the centralised data team) but not all of them, and most importantly not all those of the business users. I’m not saying that those older tools were bad or that centralised BI was a total failure, far from it; at best they provided an important set of quality-controlled core reports and at worst they were a convenient place for users to go to export data to Excel. But no-one can deny that those older tools died away for a reason, and I feel like some modern data platforms are repeating the same mistake.

In contrast the Power BI approach – and now the approach of Fabric – of empowering business users within an environment where what they are doing can be monitored, governed and guided might seem dangerous, but at the end of the day it’s more successful because it is grounded in the reality of how people use data. You can still manage your most important data and reports centrally, but you have to accept that much of the work that happens with data, in fact most of it, happens away from the centre. “Discipline at the core, flexibility at the edge” as my boss likes to say. This is as much a question of data culture as it is of the technology that you use, but Power BI and Fabric support this approach by offering some tools that are easy to use for people whose day job might not be data and by being cheap enough to be enabled for a mass audience of users, while also providing other tools that appeal to the data professionals.

Central data teams sometimes think of their business users as children, and as a parent if you saw your six-year-old pick up a bottle of vodka and try to take a swig you’d snatch it out of their hands, in the same way that some data teams try to limit access to data and the tools to use with it. Business users aren’t children though, or if they are they are more like my pretty-much grown-up children, and you can’t take that bottle of vodka away from them. If you do they’ll just go to the shops, buy another one and drink it out of your sight. Instead you can make sure they are aware of the dangers of alcohol, you can set an example of responsible consumption, and you can educate them on how to make sophisticated cocktails as an alternative to drinking the cheap stuff neat. And while, inevitably, they will still make mistakes (think of that spaghetti Power BI model that takes four hours to refresh and two minutes to display a page as the equivalent of a teenage hangover) and some may go off the rails completely, as an approach it’s more likely to be successful overall than total prohibition in my experience.

This is an old argument and one you’ve heard before, I’m sure. Why am I talking about it again? Well, apart from the fact that, as I mentioned, some vendors are selling the centralise-everything dream once more, I think we’re on the verge of another self-service BI revolution that’s going to be even bigger than the one that happened fifteen or so years ago and maybe as big as the one that happened when Excel landed on desktop PCs forty years ago: a revolution driven by AI. Whether I like it or not, and whether it will lead to better decisions or not, is irrelevant: it’s coming. Developers whose opinions I trust, like Jeffrey Wang, are already saying how it’s transforming their work. More importantly I’ve tried it myself: it let me do stuff I couldn’t do before, and even if the quality was not great it did what I needed, and most of all it was fun. Once business users whose job it is to crunch data get their hands on these tools (when the tools are ready – I don’t think they are quite yet), understand what they can do, and start having fun themselves it will be impossible to stop them. An agent grabbing a chunk of data from your centralised, secure data catalog and then taking it away to who-knows-where to do who-knows-what will be the new version of exporting to Excel. Already a lot of the BI managers I talk to are aware of the extent to which their business users are feeding data into ChatGPT on their personal devices to get their work done, even if company rules tell them not to. We need to accept that business users will want to use AI tools and provide a flexible, safe, governed way for these new ways of working with data to occur.

No data platform is ready for this future yet because no-one knows exactly what that future will look like. I can imagine that some things will be familiar: there will probably still be reports as well as natural language conversations, and there will probably still be semantic models behind the scenes somewhere. How those reports and semantic models get built, and who (or what) does the building, remains to be seen. The only thing I am sure of is that business users will have more powerful tools available to them, that they will use those tools, and that they will get access to the data they need to use with them.

Fabric Data Agents: Unlocking The Full Power Of DAX For Data Analysis

Now that Fabric Data Agents (what used to be called AI Skills) can use Power BI semantic models as a data source I’ve been spending some time playing around with them, and while I was doing that I realised something – maybe something obvious, but I think still worth writing about. It’s that there are a lot of amazing things you can do in DAX that rarely get done because of the constraints of exposing semantic models through a Power BI report, and because Data Agents generate DAX queries they unlock that hitherto untapped potential for the first time. Up until now I’ve assumed that natural language querying of data in Power BI was something only relatively low-skilled end users (the kind of people who can’t build their own Power BI reports and who struggle with Excel PivotTables) would benefit from; now I think it’s something that will benefit highly-skilled Power BI data analysts as well. That’s a somewhat vague statement, I know, so let me explain what I mean with an example.

Consider the following semantic model:

There are two dimension tables, Customer and Product, and a fact table called Sales with one measure defined as follows:

Count Of Sales = COUNTROWS('Sales')

There’s one row in the fact table for each sale of a Product to a Customer. Here’s all the data dumped to a table:

So, very simple indeed. Even so there are some common questions that an analyst might want to ask about this data that aren’t easy to answer without some extra measures or modelling – and if you don’t have the skills or time to do this, you’re in trouble. One example is basket analysis type questions like this: which customers bought Apples and also bought Lemons? You can’t easily answer this question with the model as it is in a Power BI report; what you’d need to do is create a disconnected copy of the Product dimension table so that a user can select Apples on the original Product dimension table and select Lemons on this new dimension, and then you’d need to write some DAX to find the customers who bought Apples and Lemons. All very doable but, like I said, needing changes to the model and strong DAX skills.
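
To give you an idea of what that might involve, here’s a minimal sketch of the kind of measure the disconnected-table approach needs. The disconnected table name 'Product 2' and the column names 'Customer'[Customer] and 'Product'[Product] are my assumptions for illustration, not taken from the model above:

Customers Buying Both =
// Customers with at least one sale of whatever is selected on the Product slicer
VAR BoughtSelected =
    FILTER ( VALUES ( 'Customer'[Customer] ), [Count Of Sales] > 0 )
// Customers with at least one sale of whatever is selected on the disconnected Product 2 slicer
VAR BoughtOther =
    CALCULATETABLE (
        FILTER ( VALUES ( 'Customer'[Customer] ), [Count Of Sales] > 0 ),
        REMOVEFILTERS ( 'Product' ),
        TREATAS ( VALUES ( 'Product 2'[Product] ), 'Product'[Product] )
    )
RETURN
    // Customers who appear in both lists
    COUNTROWS ( INTERSECT ( BoughtSelected, BoughtOther ) )

Straightforward for an experienced DAX developer, but not something a typical report consumer could ever write.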

I published my semantic model to the Service and created a Data Agent that used that model as a source. I added two instructions to the Data Agent:

  • Always show results as a table, never as bullet points
  • You can tell customers have bought a product when the Count of Sales measure is greater than 0

The first instruction I added because I got irritated by the way Data Agent shows the results with bullet points rather than as a table. The second probably wasn’t necessary because in most cases Data Agent knew that the Sales table represented a sale of a Product to a Customer, but I added it after one incorrect response just to make that completely clear.

I then asked the Data Agent the following question:

Show me customers who bought apples and who also bought lemons

And I got the correct response:

In this case it solved the problem in two steps, writing a DAX query to get the customers who bought lemons, writing another DAX query to get the customers who bought apples, and then finding the intersection itself:

At other times I’ve seen it solve the problem more elegantly in a single query, finding the customers who bought apples and lemons using the DAX INTERSECT() function.
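
To give you an idea of what that single-query approach looks like, here’s a sketch of the kind of DAX query it might generate – again, the column names here are my assumptions, not taken from the model above:

EVALUATE
// Customers with sales when Product is filtered to Apples
VAR BoughtApples =
    CALCULATETABLE (
        SUMMARIZE ( 'Sales', 'Customer'[Customer] ),
        'Product'[Product] = "Apples"
    )
// Customers with sales when Product is filtered to Lemons
VAR BoughtLemons =
    CALCULATETABLE (
        SUMMARIZE ( 'Sales', 'Customer'[Customer] ),
        'Product'[Product] = "Lemons"
    )
RETURN
    // Customers present in both lists
    INTERSECT ( BoughtApples, BoughtLemons )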

I then asked a similar question:

For customers who bought apples, which other products did they buy?

And again, I got the correct answer:

In this case it ran five separate DAX queries, one for each customer, which I’m not thrilled about; but again, at other times it has solved the problem more elegantly in a single DAX query.
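
As a sketch, with the same assumed column names as before, a single query solving this problem might look something like this:

EVALUATE
// Customers who bought apples
VAR AppleBuyers =
    CALCULATETABLE (
        SUMMARIZE ( 'Sales', 'Customer'[Customer] ),
        'Product'[Product] = "Apples"
    )
// All products bought by those customers
VAR ProductsTheyBought =
    CALCULATETABLE (
        SUMMARIZE ( 'Sales', 'Product'[Product] ),
        AppleBuyers
    )
RETURN
    // Remove apples themselves from the list
    EXCEPT ( ProductsTheyBought, { "Apples" } )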

Next I tried to do some ABC analysis:

Group customers into two categories: one that contains all the customers with just one sale, and one that contains all the customers with more than one sale. Show the total count of sales for both categories but do not show individual customer names.

And again I got the correct answer:
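
Once again, as a sketch with assumed column names, a single DAX query that answers this question could look something like this:

EVALUATE
// Count of sales per customer, ignoring customers with no sales
VAR SalesPerCustomer =
    FILTER (
        ADDCOLUMNS ( VALUES ( 'Customer'[Customer] ), "@SalesCount", [Count Of Sales] ),
        [@SalesCount] > 0
    )
// Assign each customer to one of the two categories
VAR Categorised =
    ADDCOLUMNS (
        SalesPerCustomer,
        "@Category", IF ( [@SalesCount] = 1, "One Sale", "More Than One Sale" )
    )
RETURN
    // Total up the sales per category, without showing individual customer names
    GROUPBY (
        Categorised,
        [@Category],
        "Total Count Of Sales", SUMX ( CURRENTGROUP (), [@SalesCount] )
    )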

I could go on but this post is long enough already. I did get incorrect answers for some prompts, and there were also some cases where the Data Agent asked for more details or a simpler question – but that’s what you’d expect. I was pleasantly surprised at how well it worked, especially since I don’t have any previous experience with using AI for data analysis, crafting prompts or anything like that. No complex configuration was required and I didn’t supply any example DAX queries (in fact Data Agents don’t allow you to provide example queries for semantic models yet). What does this all mean though?

I’m not going to argue that your average end user is going to start doing advanced data analysis with semantic models using Data Agents. The results were impressive, but while I think Data Agents (and Copilot for that matter) do a pretty good job with simpler problems, I wouldn’t want anyone to blindly trust the results for more advanced problems like these. However if you’re a data analyst who is already competent with DAX and who knows to always verify the results you get from a Data Agent, I think this kind of DAX vibe-coding has a lot of value. Imagine you’re a data analyst and you’re asked that question about which products customers who bought apples also bought. You could search the web, probably find this article by the Italians, get scared, spend a few hours digesting it, create a new semantic model with all the extra tables and measures you need, and then finally get the answer you want. Maybe you could try to write a DAX query from scratch that you can run in DAX Studio or DAX Query View, but that requires more skill because no-one blogs about solving problems like this by writing DAX queries. Or you could ask a Data Agent, check the DAX query it spits out to make sure it does what you want, and get your answer much, much faster and more easily. I know which option I’d choose.

To finish, let me answer a few likely questions:

Why are you doing this with Fabric Data Agents and not Power BI Copilot?

At the time of writing, Data Agents, the Power BI Copilot that you access via the side pane in a report, and Power BI Copilot in DAX Query View all have slightly different capabilities. Power BI Copilot in the side pane (what most people think of as Power BI Copilot) couldn’t answer any of these questions when I asked them, but I didn’t expect it to because, even though it can now create calculations, it can still only answer questions that can be answered with a Power BI visual. Copilot in DAX Query View is actually very closely related to the Data Agent’s natural language-to-DAX functionality (in fact at the moment it can see and use more model metadata than a Data Agent) and unsurprisingly it did a lot better, but the results were still not as good as the Data Agent’s. Expect these differences to go away over time and everything I say here about Data Agents to become equally applicable to Power BI Copilot.

This isn’t anything new or exciting – I see people posting about using AI for data analysis all the time on LinkedIn, Twitter etc. What’s different?

Fair point. I see this type of content all the time too (for example in the Microsoft data community Brian Julius and my colleague Mim always have interesting things to say on this subject) and I was excited to read the recent announcement about Analyst agent in M365 Copilot. But typically people are talking about taking raw data and analysing it in Python or generating SQL queries. What if your data is already in Power BI? If so then DAX is the natural way of analysing it. More importantly there are many advantages to using AI to analyse data via a semantic model: all the joins are predefined, there’s a lot of other rich metadata to improve results, plus all those handy DAX calculations (and one day DAX UDFs) that you’ve defined. You’re much more likely to get reliable results when using AI on top of a semantic model compared to something that generates Python or SQL because a lot more of the hard work has been done in advance.

Is this going to replace Power BI reports?

No, I don’t think this kind of conversational BI is going to replace Power BI reports, paginated reports, Analyze in Excel or any of the other existing ways of interacting with data in Power BI. I think it will be a new way of analysing data in Power BI. And to restate the point I’ve been trying to make in this post: conversational BI will not only empower low-skilled end users, it will also empower data analysts, who may not feel they are true “data scientists” but who do have strong Power BI and DAX skills, to solve more advanced problems like basket analysis or ABC analysis much more easily.

Power BI/AI Book Roundup

Here’s another one of my occasional posts about books I’ve been sent free copies of. Full disclosure: as always, these aren’t reviews as such, they’re more like free publicity in return for the free books and I don’t pretend to be unbiased; also, the Amazon UK links have an affiliate code in them that gives me a kickback if you buy any of these books.

The AI Value Playbook, Lisa Weaver-Lambert

What am I doing covering an AI book here? Lisa is an ex-colleague of mine at Microsoft and I respect her opinions. Also, like a lot of you I suspect, I have mixed feelings about the current AI boom: I can see the value in AI but I can also see the vast amount of hype and the obviously ridiculous claims being made. More than anything I see senior executives talking confidently about a subject I’m sure they don’t understand, and that is clearly a big problem. This book aims to help solve that problem by providing a practical guide to AI for non-technical leaders, in the form of a series of case studies and interviews with entrepreneurs and C-level people in the AI space. This is a very readable book – Lisa has talked to a lot of interesting, knowledgeable people – and the format makes it a lot more palatable for the target audience of your boss’s boss’s boss than your average tech book. As a technical person who isn’t by any means an AI expert I also enjoyed reading it.

The Complete Power BI Interview Guide, Sandielly Ortega Polanco, Gogula Aryalingam and Abu Bakar Nisa Alvi

Spend any time on public Power BI forums and you’ll see a lot of questions from people who want to know how to start a career in Power BI or get tips for Power BI interviews; as a result I’m sure there’s a big market for a book like this. It’s a mix of technical topics (the type that you might be asked about in a technical interview for a Power BI job) and non-technical advice such as how to network on LinkedIn, negotiate salaries and accept or reject a job offer. That might seem a bit of a strange combination but it works, and the advice is both detailed and very sensible, so I would have no hesitation in recommending this to anyone trying to get a job as a Power BI developer.

Project Sophia: An AI-Powered Business Research Canvas

I don’t know how, but somehow I missed the announcement of the preview of Project Sophia at Ignite (it’s not built by the Fabric product group and I’ve been distracted by all the cool stuff we’re working on at the moment, so…). If you’re a fan of Power BI and data in general you are definitely going to want to check it out though! It’s not BI, it’s an “AI-Powered business research canvas”, although that description doesn’t really do it justice; even if you read the announcement blog post you probably won’t get what it does. You need to watch this demo video first, watch this Ignite session if you have more time, read the docs and try it for yourself here if you’re based in North America.

Here’s a quick summary of how it works. You upload some data to work with (you can use it with sample data too), ask it a question, it generates some visual and text-based answers, and then suggests next steps or allows you to click on some data and ask a further question. Each new step in the journey creates a new page of information; you can easily go back and see previous pages.

I fed a subset of the UK Land Registry price paid data into it and while it was far from perfect I was still extremely impressed with the results:

Before you ask the obvious question: at the time of writing it only connects to relatively small Excel, CSV and PDF files. And please don’t ask me any follow up questions about this because I won’t be able to answer them 🙂