Power BI DirectQuery Mode: A Better Choice Than You Might Think

For as long as I’ve been using Power BI – which has been from the beginning – the advice about which storage mode to choose has been the same: use Import mode unless you have a really, really good reason to use DirectQuery mode and even then you’re probably wrong and should use Import mode. Import mode was always a lot faster and a lot easier to tune. Marco’s advice in this LinkedIn post from last year pretty much summed up my attitude and that of every other Power BI expert out there:

DirectQuery was only for situations where you either:

  • Had more data than you could fit in memory, which was very, very rare, or
  • Where you needed to see truly real-time data in your reports, which business stakeholders always claim they want but they rarely ever need

The addition of Direct Lake as a third storage mode in Fabric didn’t change the situation. While there are some really good reasons to use Direct Lake instead of Import mode (which I should really cover in a separate blog post because I don’t think they are explained properly anywhere), it didn’t change the advice around DirectQuery and I still didn’t think DirectQuery was a good option for most people.

I have now changed my mind. Import or Direct Lake should still be the default for most projects but DirectQuery is now the best choice when you’re working with larger data volumes that could still be handled by Import or Direct Lake. This is a controversial statement, I know, and it’s based on experiences with customers that I can’t talk about directly, and as always there are some important details, so let me explain myself.

My new advice is this. If:

  • You are starting a new project that is 100% on Fabric and all of your data will be loaded into a Fabric Lakehouse or Warehouse, and
  • You have the following:
    • Fact tables with more than a couple of billion rows in them, and
    • Dimension tables with more than a couple of million rows in them, and/or
    • Distinct count measures on columns with more than a couple of million distinct values in

Then you should start with a composite model design that uses:

  • DirectQuery mode fact tables on the Warehouse or SQL Endpoint of your Lakehouse
  • Dual mode dimension tables (which means they can swap between Import mode and DirectQuery mode when necessary)
  • Aggregation tables based on your fact tables that are either in Import mode or DirectQuery mode

Why am I recommending this? Well let me take Marco’s objections to DirectQuery mode one by one:

Is Import mode or Direct Lake mode always faster than DirectQuery? For the composite model scenario described above, no, not always. Even if your Import mode or Direct Lake mode model fits within the memory limits of the Fabric capacity that you’re using, then a composite model and DirectQuery fact tables can be almost as fast and may be faster. This is because:

  • If your query hits an aggregation table, which is potentially a lot smaller than the fact table, then it will be faster than if it hits an Import mode fact table. Of course if your aggregation tables are in Import mode then naturally your query will be faster than if it hits an Import mode fact table, so you could argue that this is an Import mode scenario anyway. But the native aggregation functionality is not available in Import mode or Direct Lake mode at the time of writing, it’s only available if the underlying table is in DirectQuery mode. And yes I know you can simulate aggregations in Import mode with some clever DAX but that clever DAX can also carry an overhead.
  • The Fabric Warehouse engine, which not only powers Fabric Warehouse but also the SQL Endpoint of a Lakehouse, has improved a lot in the last year and is now incredibly fast. When the new GPU acceleration and other upcoming performance features land then it will be even faster. For example, it’s already faster at doing distinct counts than the Vertipaq engine used by Import or Direct Lake models.
  • The Warehouse engine also has some architectural advantages over Vertipaq when it comes to concurrency. Now Vertipaq is already really, really good at concurrency and features like semantic model scale-out make it even better, but for the large data volumes mentioned above and when you have more than 5-10 concurrent users then Warehouse is better. It’s actually quite rare to have more than 5-10 genuinely concurrent users (ie users that are running queries at exactly the same time) in a BI solution but it does happen, for example at month-ends.

Is DirectQuery for the rich? Is DirectQuery always more expensive than Import mode or Direct Lake mode? Actually no, not if you’re using a composite model in Fabric for the scenarios we’re talking about.

Let’s say you’re building a large Import mode or Direct Lake mode model in Fabric and you hit the limits of the capacity SKU that you’re using. It could be that you’re hitting the memory limit for your SKU or one of the other Direct Lake guardrails. Or it could be that during your load testing (and you should always do load testing) you hit the CU limits of your capacity. What do you do? Well you can and should do some tuning to see if you can avoid hitting those limits. Or you can scale up to the next capacity size, although that might be expensive. The third option is to use a composite model with DirectQuery fact tables in the way I’ve described.

The first advantage of a composite model with DirectQuery fact tables here is that because only your dimension tables and any Import mode aggregation tables you have are subject to the memory limits for Power BI models that each capacity SKU enforces, you’re much less likely to hit those limits.

The second advantage of a composite model with DirectQuery is that is that if you hit your fact table then Power BI will generate SQL queries against the Warehouse engine and the SQL queries generated are counted as background operations which are then smoothed over 24 hours. DAX queries on Import mode or Direct Lake mode models are counted as interactive operations and are smoothed over 5 to 64 minutes. Queries against a DirectQuery table will still consume some interactive CUs and of course if you hit a Dual mode dimension table or an Import mode aggregation that will also consume interactive CUs, but the Storage Engine is where most of the CUs get burned in an Import or Direct Lake mode model. This is the main reason why a composite model approach can handle more concurrent users: you’re less likely to run into the CU limits for the capacity SKU you’re using because of the difference in how smoothing works.

What’s more, these advantages mean that even if you don’t hit the limits for the capacity SKU you’re using, you could still save money by using a composite model because it could allow you to use a smaller capacity.

You could argue that both of these advantages are purely accounting tricks, results of the rules that we at Microsoft have imposed on how Fabric capacities work, and you’d have a point. But it’s unlikely these rules will change anytime soon.

Is DirectQuery still difficult to manage? Yes, this objection still stands in my opinion. If you need to tune an Import model you need an expert in tuning Import models. They’re rare but they exist – you can call Marco for example or take one of his courses. If you need to tune a composite model like the one I’ve described you need someone who can tune an Import model, someone who can tune a DirectQuery model (which is really rare) and someone who can tune your Fabric Warehouse or SQL Endpoint. You might not have that combination of skills in your team, and if you do then it would make maintenance and development more expensive – which undermines the “DirectQuery is cheaper” argument a bit.

Finally, there are a couple of other important questions that need to be addressed.

I have Databricks, Snowflake or some other database and I’d like to use that instead of Fabric Warehouse or the Lakehouse SQL Endpoint. Is that a good idea? No, and I’m not just saying that because I work on the Fabric team and I want you to believe that Warehouse/SQL Endpoint is faster or better than them. It’s because the underlying architecture of Fabric and the connector that Power BI uses to connect to Fabric Warehouse/SQL Endpoint means that DirectQuery on Warehouse/SQL Endpoint is significantly faster and more scalable, even apart from the performance of the SQL queries themselves. More optimisations are planned to make the “better together” story even more compelling. And no, before anyone suggests it, we’re not deliberately trying to hobble the performance of other, non-Fabric databases – many other Microsoft data sources share the same architectural disadvantages when it comes to DirectQuery as Snowflake and Databricks. And while it is possible to make DirectQuery perform well for any data source, my point is that DirectQuery on Fabric Warehouse is a special case.

Also, much anecdotal evidence suggests that using Power BI in DirectQuery mode on non-Fabric sources can be more expensive than Import mode because you need to pay to use those other sources as well as pay for your Fabric capacity. DirectQuery mode on non-Fabric databases is more expensive in CU terms than DirectQuery on Fabric Warehouse/SQL Endpoint because of those architectural differences I mentioned and I’ve seen it even be more expensive than an Import model in terms of CU consumption. Meanwhile, as I’ve said, for larger volumes DirectQuery on Warehouse/SQL Endpoint can be cheaper than Import mode or Direct Lake.

Are there any other advantages to DirectQuery mode? Yes but maybe not ones you care about. Do you want genuinely dynamic calculated columns? It’s been possible for years with DirectQuery (see here – the example uses KQL but similar things are possible in SQL). One day I’ll get my demo showing how to do proper time zone conversion, handling daylight savings time, in DirectQuery. My colleague Mark Pryce-Maher has a nice demo of calling Fabric AI functions in a DirectQuery model. But these are all niche use cases.

Your size recommendations for when to use DirectQuery above are very vague. Can’t you be more precise? No, because so much depends on the design of your semantic model and reports and the nature of your data. In an ideal world you would still always choose Import mode or Direct Lake mode as your default and only shift to a DirectQuery composite model when necessary but doing that halfway through a project is quite disruptive. If you want to test the performance of Import mode, Direct Lake mode and DirectQuery mode for your project then go ahead, but if you don’t then for the volumes I’m talking about there’s a very good chance DirectQuery is the best option.

Last of all, I need to stress that this recommendation could change in the future. While Fabric Warehouse is getting a lot of improvements and optimisations, so is Direct Lake mode, so the price and performance characteristics of both will change a lot and that means the decision about which one is the best choice for larger data volumes may change too. I promise to update this post if and when that happens.

New Book: “Extreme DAX” 2nd Edition

“Extreme DAX” by Michiel Rozema, Madzy Stikkelorum and Henk Vlootman

Continuing my series of not-reviews of the latest Power BI-related books (and there are fewer and fewer of them being published) that I receive free copies of, here’s the second edition of a book that for some reason I missed first time around: “Extreme DAX”. The first few chapters cover familiar territory: data modelling, calculated columns, measures, time intelligence calculations etc. Of course there are several other books available that explain these concepts with the canonical text being “The Definitive Guide To DAX”; “Extreme DAX” does a good job with the basics and includes the latest additions to the language such as UDFs but I can’t help but think it would have been better to skip all this and assume that the reader knows it all already. Luckily things get a lot more extreme as the book progresses and it shifts to showing how to solve complex – but nevertheless real-life – Power BI problems with DAX. There’s a whole chapter on Auto-Exist for example which is pretty hardcore. For me this is where the real value of the book lies and there’s enough good content here for any advanced Power BI developer to want to buy it. Definitely worth checking out.

Improve Performance Of Staging In Fabric Dataflows Gen2 By Disabling V-Order

Quite a few new Dataflows Gen2 features were released recently without much fanfare, but that doesn’t mean they aren’t important. I will take a look at them all in my next few posts; in this first post I’ll look at the ability to disable V-Order on staged data.

As the (very detailed) documentation for this new feature describes, V-Order is a write-time optimisation for the parquet files that underpin the Delta tables that OneLake uses to store data. It slows down writing data to the tables but means that reading data from them, for example in Power BI Direct Lake mode, is much faster. It used to be the case that when you staged data inside a dataflow that data always had V-Order applied; now you have the option to disable V-Order. Disabling V-Order makes staging faster and because staged data is rarely queried more than a few times, disabling V-Order usually improves overall refresh performance.

To test this I created a simple dataflow that connected to a large (5.58GB) CSV file that contained 17.6 million rows of data, staged the data in a query called StageData, then did a group by on that data in a second query called GroupBy.

I turned off Fast Copy and left the “Enable V-Order compression” setting on:

[At the time of writing this post the ability to disable V-Order only works when Fast Copy is not used – I expect this to change in the future]

I refreshed the dataflow and it took 1 minute 59 seconds. The StageData query (where the staging takes place) took 1 minute 31 seconds; the GroupBy query took 12 seconds.

I then disabled V-Order compression for staging:

…and refreshed again. This time overall refresh took 1 minute 32 seconds, the StageData query took 1 minute 13 seconds and the GroupBy query took 7 seconds. While there is always a certain amount of variation in dataflow refresh timings it’s clear that disabling V-Order resulting in staging being about 20 seconds faster with no reduction in performance of the group by transformation on the staged data. So, in this case at least, disabling V-Order was a good thing for refresh performance.

When you decide whether to use staging in a dataflow you have to test to see whether the extra time needed to stage the data is worth it compared to the performance improvements you get by doing transformations on the staged data (which mostly come from those transformations having the opportunity to be folded). Since turning off V-Order makes staging faster it means that staging is more useful and will result in better overall dataflow refresh performance more often.

Fabric Dataflows Gen2 And Concurrent Evaluation, Part 3: Is More Parallelism Always Good?

To finish off my series of posts on concurrent evaluation in Fabric Dataflows Gen2 (see part 1 and part 2) I decided to do some more realistic tests to see how much parallelism I could get. To do this I uploaded 244 identical Excel files containing almost 542000 rows of data each to a SharePoint document library. Excel files are probably the worst-performing file format for dataflows (see here for some tests that show this), while SharePoint is probably the worst-performing place to store data for a dataflow and also has a reputation for throttling applications that make too many requests.

As a baseline I created a Dataflow Gen2 that combined all the data from all the Excel files in the document library and loaded it into a single table in a Fabric Warehouse; I configured this dataflow up so it did not use Partitioned Compute.

Looking at the Recent Runs data for this dataflow I could see that the query that combined the data from all the Excel files took on average 28 minutes 8 seconds to run.

I then created a copy of this dataflow and modified it so that it did use Partitioned Compute. I set the column containing the Excel filename as the partition key which meant that, in theory, the dataflow could read the data from each of the files in parallel. I then refreshed the dataflow with different maximum concurrency settings (see the previous post in this series for how to do this). Here are the results:

Maximum ConcurrencyAverage Refresh Duration (Minutes:Seconds)
418:37
89:42
165:24
2505:21
No limit set5:16

A few interesting things to note:

  • Refreshes with a maximum concurrency of 4 were significantly faster than the baseline although nowhere near four times faster, possibly because of the overhead introduced by Partitioned Compute, but
  • Refreshes with a maximum concurrency of 8 were around twice as fast as refreshes with a maximum concurrency of 4, and refreshes with a maximum concurrency of 16 were almost four times as fast
  • There was no performance gain achieved by setting a maximum concurrency of more than 16
  • Setting a limit of more than 16 resulted in the same performance as not setting a limit at all

One last thing to mention is that one of the refreshes when no limit was set failed after two minutes. It was difficult to tell why it failed but after downloading the logs and looking through them it seems that SharePoint was returning a 429 Too Many Requests error, which makes sense given that the dataflow was trying to read data from a lot of Excel files at the same time. Given that no data source, especially SharePoint, can be expected to handle an infinite number of parallel requests for data from a dataflow then this is a good example of where too much parallelism can be a bad thing and cause errors, and where it’s important to limit the amount of parallelism inside the dataflow.

Fabric Dataflows Gen2 And Concurrent Evaluation, Part 2: Partitioned Compute

In the first part of this series I showed how the Concurrency setting in a Fabric Dataflows Gen2 can affect refresh performance when there are multiple queries inside the dataflow. In this post I will show how, with Partitioned Compute, this setting can also affect the performance of a single query within a dataflow.

To test this I created a dataflow with one query, a modified version of the query that I used in this post from earlier this year which returns a table with ten rows and calls a function with a built-in delay of 60 seconds on each row.

Refreshing the dataflow with Partitioned Compute enabled and with the default concurrency settings:

Refreshing this dataflow took 1 minute 47 seconds:

Looking at the details for just the query with the delay, I could see that it took 1 minute 19 seconds – which suggests that every row was evaluated in parallel:

I don’t know if a concurrency of ten is something that will be achievable every time; based on what I saw in the tests from my previous posts the amount of concurrency with the default setting is variable.

I then changed the Concurrency setting to 1:

And then refreshed the dataflow again. This time the refresh took 11 minutes 31 seconds:

Looking at the details for the query with the delay I could see that it took 11 minutes 9 seconds:

This suggests that each row in the query was evaluated one after the other and shows that the Concurrency setting can be used to control the number of concurrent evaluations within a single query. As Partitioned Compute gets introduced in more and more scenarios in Fabric Dataflows Gen2 being able to control concurrency and tune performance will be more and more important.

Fabric Dataflows Gen2 And Concurrent Evaluation, Part 1

Did you know that if your Fabric Dataflows Gen2 contains several queries then you can control how many of them are evaluated in parallel when your dataflow refreshes? In this series I’ll look at how how you can do this and how it may result in better performance – at least in some cases.

Let’s start with the basics. I created a Dataflow Gen2 with ten queries which each returned a table of one row and one column after one minute. I used the #table function to generate the table without connecting to a data source, code from this post to add the delay and the trick in this post to make sure the delay was only applied when the dataflow refreshed. The output of each query was loaded to a Fabric Warehouse.

I then refreshed the dataflow using the default settings and found that the refresh took 2 minutes 9 seconds by looking in Recent Runs; each individual query took somewhere between 1 minute 10 seconds and 1 minute 30 seconds, which matches the 10-30 second overhead on query execution that I normally see when tuning dataflows. I refreshed it two more times and the durations were 1 minute 32 seconds and 3 minutes 38 seconds. This all suggests that the amount of concurrency was variable, and that the dataflow was sometimes able to evaluate all the queries in parallel and sometimes only able to evaluate some of them in parallel.

I then opened the dataflow, went to the Options dialog and the Scale pane, checked the “Limit number of concurrent evaluations” box and set the slider to 1:

I then refreshed the dataflow again and this time the overall refresh took 12 minutes 29 seconds, which is consistent with ten query evaluations that took 1 minute 10 seconds to 1 minute 30 seconds running one after the other.

Finally, I went back into the dataflow and set the Concurrency slider as high as possible to 250:

I then refreshed the dataflow again and this time the refresh took 1 minute 33 seconds, suggesting all the queries were evaluated in parallel.

So from all this we can see that, as you might expect, increasing the amount of concurrency improved dataflow refresh performance. There’s more to learn about concurrency though and as we shall learn later in this series, more concurrency isn’t always better.

Understanding The “You’ve Exceeded The Capacity Limit For Dataset Refreshes” Error in Power BI

If you have a lot of Power BI semantic models that are scheduled to refresh at the same time in the Service then you may find that some of them fail with the following error:

You’ve exceeded the capacity limit for dataset refreshes. Try again when fewer datasets are being processed.

[Note: “dataset” is the old name for a Power BI semantic model. Someone should update the error message.]

What causes it? Each Fabric or Power BI Premium capacity SKU can support (and “support” is the operative word here, as we shall see) a certain number of concurrent semantic model refreshes. These limits are documented here in the Model Refresh Parallelism column of the table on that docs page:

The error itself is documented here and I’ve mentioned it myself in a previous post here, but the interesting thing about the limit on the number of concurrent refreshes there’s a lot more to it than you might expect – Power BI is very forgiving.

Before I go any further, it’s important to make clear that this error is nothing to do with how many CUs you are using on your capacity at the time of the error, although the limits are in place to stop you overloading your capacity: running multiple semantic model refreshes at the same time could cause a sizeable increase in CU consumption even after smoothing.

For example, to investigate how this limit is applied I created an F2 capacity, added a workspace to that capacity, and uploaded several identical Power BI semantic models to that workspace. I used some Power Query magic to control how long those semantic models took to refresh.

For my first test I configured two semantic models so they took 120 seconds to refresh and started a manual refresh on both at the same time. Now, looking at the table above, you might think that because an F2 supports one concurrent semantic model refresh then I would get an error but no, both semantic models refreshed successfully and both refreshes took 120 seconds. The published limit is the number of semantic models that Power BI guarantees that can be refreshed concurrently; in practice the limit may be exceeded.

Next, I started a manual refresh on six semantic models that were all configured to take 120 seconds to refresh. Again, they all refreshed successfully and all took 120-122 seconds to refresh. Finally I started a manual refresh on fifteen semantic models that again were configured to take 120 seconds to refresh and this is where I saw something different. All of the semantic models refreshed successfully in the end, and none showed the warning triangle in the first screenshot above. Most of the semantic models took 120-122 seconds to refresh but some took longer. For example, take a look at this Refresh History for one of the models:

The overall refresh was successful but took 305 seconds, not 120 seconds. This is explained by the refresh failing immediately with the “You’ve exceeded the capacity limit for dataset refreshes” error, then the Service waiting a minute to retry the refresh (for more information on automatic refresh retries see here) which resulted in the same error occuring again, then the Service waiting for a further two minutes before retrying the refresh again, at which point it succeeded and took 122 seconds.

So you can see what I mean when I say Power BI is very forgiving about these limits. It’s also worth mentioning that scheduled refreshes don’t always happen at exactly the time they are scheduled for. The Service may wait several minutes after the scheduled time before it tries the first refresh. This is what is meant by the statement in the docs here that “You can schedule and run as many refreshes as required at any given time, and the Power BI service runs those refreshes at the time scheduled as a best effort.

In other tests with the same number of semantic models but longer refresh times, I was able to observe a scenario where a refresh scheduled for 17:30 did not start until almost eight minutes after that time and then failed nine times before it succeeded; note that the amount of time the Service waited to retry after the second failure went up to five minutes:

Of course Power BI can’t keep retrying indefinitely and eventually refreshes will fail with the “You’ve exceeded the capacity limit for dataset refreshes” error. Here’s the Refresh History for a semantic model where refresh ultimately failed after four retries (this took a lot of concurrent, slow refreshes to repro):

If you’re encountering this error then the solution is obvious: reduce the number of refreshes that are happening at any given time. But how do you know which refreshes are scheduled for when and how long they will take? The Refresh Schedule page for your capacity in the Admin Portal gives you a summary of the number of semantic models that are predicted to be refreshed in a 30 minute time slot and how long they are likely to take. The Fabric Monitoring Hub gives you details of historical activity. And if you have Workspace Monitoring or Log Analytics configured on your workspace you can get a lot of detail on what happens when refreshes are run, including seeing when the “You’ve exceeded the capacity limit” error occurs and refreshes retry.

Once you know what is being refreshed and when, you need to do two things. First see if you can reduce the number of times any given semantic model is refreshed. It’s pretty common for users to configure their model to refresh multiple times a day even if the actual data source only changes once a day, for example, so easy wins may be possible. Second, tuning the amount of time refreshes take can also reduce the amount of concurrent refreshes: if your semantic models refresh quickly it’s less likely it will overlap with other refreshes. Tuning data sources, tuning Power Query, increasing refresh parallelism, removing unnecessary columns or tables, tuning the DAX used in calculated columns and tables or replacing those calculated columns and tables with pre-calculated data in in the data source, and using incremental refresh are some of the things you will need to look at. Scaling up to a larger capacity, or buying an additional (possibly smaller) capacity and moving some workspaces over to it will also of course also solve the problem because the limits are per capacity.

In summary, what this shows is that the published, supported limits on the number of concurrent semantic model refreshes in the Power BI Service are a lot lower than what is achievable in practice. This is very important in self-service BI scenarios because it means refreshes are a lot less likely to fail than they would otherwise. But if you are refreshing a lot of semantic models, exceed the published limits on a regular basis and find some of your refreshes fail then you have no choice but to take some of the actions described above to get back under the limits.

Power BI Semantic Model Memory Errors, Part 5: The “Maximum Allowable Memory Allocation” Error

This is a very late addition to the series of posts I wrote back in 2024 and which started here on Power BI memory errors. It’s about a very rare error that is hard to deal with and often temporary but since people do run into it from time to time I decided to write about it so there is some useful information available about it online.

The error, which can occur when you refresh a semantic model or render a report, has two associated error messages:

The operation has been cancelled because there is not enough memory available for the application. If using a 32-bit version of the product, consider upgrading to the 64-bit version or increasing the amount of memory available on the machine.

or more commonly:

You have reached the maximum allowable memory allocation for your tier. Consider upgrading to a tier with more available memory

The error number associated with this error is 0xC11C0005 or -1055129595.

What causes it? This needs a bit of explanation and what follows is an over-simplification…

When you publish a Power BI semantic model to the Service it runs on one of hundreds of physical machines – nodes – alongside other semantic models published by other people. The Service always tries to put your semantic model on a node that has enough memory and CPU available for it to be queried or refreshed; if it decides that isn’t the case, the semantic model will be moved to a different node. The tricky thing is that the amount of memory or CPU available depends on whether the other semantic models on the same node are being refreshed or queried at any given time and how resource-intensive those queries and refreshes are. The limits on memory consumption that I wrote about in the previous posts in this series are there to stop any one semantic model consuming too much memory and causing problems for the other semantic models on the same node. While the algorithms used to determine which semantic models should be held together on a given node are very sophisticated (and are being improved all the time), sometimes something unexpected happens and the necessary resources aren’t available for a refresh or query. The errors above happen when the node your semantic model is being held on is under memory pressure.

What can you do about it? That’s a hard question to answer but it depends on whether your semantic model is part of the problem or not. If you only get this error once (and as I said, it’s a very rare error indeed) then you can ignore it – it’s just bad luck. However if you get this error repeatedly then it’s very likely that your semantic model is causing a memory spike and even if you aren’t hitting any other memory limit you are probably coming close and you should do some tuning. If you get this error when rendering a report you should look at the DAX queries generated by your visuals and work out whether you can reduce their memory usage by remodelling your data or rewriting the DAX in your measures. If you get this error when refreshing your semantic model you should see if you can reduce its size by remodelling your data or reduce memory consumption in other ways, for example by removing calculated columns or calculated tables and replacing them with columns and tables in your data source. For more information on how to measure memory consumption for a query or refresh, see the other posts in this series.

Connecting Power BI Semantic Models To Data Sources Automatically With Binding Hints

Did you know that you can configure your Power BI semantic model so that it automatically binds to a data source connection when you publish?

To illustrate how to do this, I created an Import mode Power BI semantic model in Power BI Desktop connected to the Products table in the ContosoSales sample database in the Azure Data Explorer help cluster. Anyone can connect to this source, you just need a Microsoft Account to authenticate. Here’s the M code from my semantic model:

let
Source = AzureDataExplorer.Contents(
"help",
null,
null,
[
MaxRows = null,
MaxSize = null,
NoTruncate = null,
AdditionalSetStatements = null
]
),
ContosoSales = Source
{[Name = "ContosoSales"]}
[Data],
Products1 = ContosoSales
{[Name = "Products"]}
[Data]
in
Products1

I then published the model to the Service but of course at that point I couldn’t refresh the model there without the extra step of connecting the newly published model to the source. As you would expect, going to the Settings pane for the semantic model gave me the option to link my data source to a connection in the Service

No surprises so far. I deleted the published semantic model and then did two things.

First I went to the Manage Connections page in the Service and created a new Shareable Cloud Connection for the Azure Data Explorer help cluster. I made a note of the connection ID:

Second, I opened my model in Power BI Desktop, scripted out the semantic model in TMDL View, then added the following Binding Hint to the model:

bindingInfo '{"kind":"AzureDataExplorer","path":"help"}'
type: dataBindingHint
connectionId: 42906b42-3e84-461f-aee4-f14fcbeb9b72

Two things to note here:

  • The name of the binding hint is a JSON representation of the connection. It consists of two parts: the kind, which is the type of connection (in this case a connection to Azure Data Explorer) and the path, which is a semi-colon delimited list of all the required parameters of the function used to connect to the source (in this case AzureDataExplorer.Contents). How do you work out what the kind and path are? Originally I worked it out through trial and error, looking at the diagnostic logs and metadata from the functions used to access data, and then I realised all the information was shown in the first screenshot above when the semantic model was not linked to a connection – the kind is shown as extensionDataSourceKind and the path is shown as extensionDataSourcePath. For reference, here’s what the name of a Binding Hint for a Snowflake connection looks like:
{"kind":"Snowflake","path":"xyz.snowflakecomputing.com;COMPUTE_WH"}
  • The connectionId is simply the connection ID from the Shareable Cloud Connection that the semantic model should be linked to.

I then hit Apply and was prompted to upgrade the model to a compatibility level of 1608 (Binding Hints are only available at that compatibility level and above) and clicked Yes:

Having done this, I then republished the semantic model and when I checked the Settings pane it was automatically connected to the Shareable Cloud Connection I had created and could be refreshed immediately:

You can add multiple binding hints if you have multiple connections. You can also add multiple binding hints for the same data source. All in all, this is a nice little feature that might be useful if you are programmatically generating and publishing semantic models and want to avoid an extra API call to bind your model to a data source.

[Update May 2026 – after talking to some of the engineers, I’ve been told that the Fabric list connections API is the best way to get the Kind and Path for a connection]

Generating Sample Data In Fabric Dataflows With FabricAI.Prompt()

Back in December the FabricAI.Prompt() M function was released in Fabric Dataflows Gen2. Most of the people writing about it at that time, as in this great post by my colleague Sandeep Pawar, focused on calling this function for each row in a table – something that the UI in the editor makes easy. However the FabricAI.Prompt() function itself is a lot more flexible. You can use it to summarise whole tables of data as I showed here; you can also use it to generate sample data. This is similar to what I blogged about here where I got Copilot to generate M code that returned sample data but using FabricAI.Prompt() is maybe a bit simpler.

The trick is to get FabricAI.Prompt() to generate a table of data in CSV format. Here’s an example prompt:

Generate a table of sample sales data in CSV format.
The table should have three columns called Country, Product and Sales.
The Country column should contain the names of random European countries.
The Product column should contain the names of random types of fruit.
The Sales column should contain random numbers between 1 and 100.
The table should contain 10 rows.
The first row of text returned should contain the column names.
Subsequent rows should contain the data.

…and here’s how this prompt can be used with Fabric.AIPrompt() and how the text that the function returns can be turned into a table using the CSV.Document function:

let
Source = FabricAI.Prompt(
"Generate a table of sample sales data in CSV format. The table should have three columns called Country, Product and Sales. The Country column should contain the names of random European countries. The Product column should contain the names of random types of fruit. The Sales column should contain random numbers between 1 and 100. The table should contain 10 rows. The first row of text returned should contain the column names. Subsequent rows should contain the data."
),
ToCSV = Csv.Document(Source),
#"Promoted headers" = Table.PromoteHeaders(
ToCSV,
[PromoteAllScalars = true]
),
#"Changed column type"
= Table.TransformColumnTypes(
#"Promoted headers",
{
{"Country", type text},
{"Product", type text},
{"Sales", Int64.Type}
}
)
in
#"Changed column type"

And here’s an example of the result: