Limit The Impact Of Expensive Power BI Queries On Your Capacity By Reducing The Query Timeout

The recent announcement of Surge Protection gives Fabric/Power BI capacity admins a way to restrict the impact of background operations on a capacity, preventing them from causing throttling. However, at the time of writing, Surge Protection does not prevent users that are running expensive DAX or MDX queries – which are interactive operations – from causing problems on your capacity. Indeed, right now, there is no direct way to stop runaway queries from consuming a lot of CUs, although there is something you can do which will help a lot: reducing the query timeout.

Surge Protection doesn’t address the problem of expensive queries yet because Power BI only knows the CU usage of a DAX or MDX query when it has finished running – by which time it’s too late to do anything about it. In many cases, though, DAX or MDX queries that consume a lot of CUs are also slow. Therefore reducing the query timeout, which will kill any query that runs longer than a specified duration, will stop these queries from consuming so many CUs.

There are two default query timeouts that you should be aware of in Power BI. First, all DAX queries generated by a Power BI report have a 225 second timeout applied by the report itself. This timeout can be changed in Power BI Desktop but it cannot be changed on a published report in the Power BI Service. Second, you can set a timeout at the capacity level by changing the Query Timeout property in the admin portal. The default setting here is 3600 seconds (one hour). Unlike the first timeout, which only applies to the DAX queries generated by a Power BI report, this timeout applies to all queries run on any semantic model associated with the capacity, including the MDX queries generated by Excel PivotTables via Analyze In Excel. Setting this second timeout to less than 225 seconds means that it will take precedence over the first timeout. Therefore it’s the Query Timeout property on your capacity that you should set.

Hitting a timeout in a Power BI report will give the user a “Query has exceeded the available resources” error; clicking See Details/More Details will give you a message like this:

The XML for Analysis request timed out before it was completed. Timeout value: 10 sec.

Hitting the query timeout in an Excel PivotTable will give you the same message:

What value should you set the Query Timeout to? In my opinion no query should ever run for more than 30 seconds because anything slower will result in a poor experience for your end users – no-one wants to sit around for ages waiting for a report to render. I also think it should be possible to tune any semantic model so all queries run under 30 seconds if you know what you’re doing. That said, in the real world, setting a timeout of 30 seconds may be unrealistic: developers may not have the skills to tune their semantic models. As a result I find a timeout of 100 seconds is often a good compromise but you should experiment with different timeouts to see what the minimum value you can get away with is.

It’s important to note that reducing the query timeout will not stop every expensive query. This is because it’s perfectly possible to have very fast queries that consume a lot of CUs – for example when distinct count measures are used, and/or when there are very large data volumes and/or when there are complex but highly-optimised measures. Also there relatively rare cases where a query will carry on running beyond the duration specified by the timeout, because the Vertipaq engine only checks if the timeout has been exceeded at certain points in the code and depending on the query there could be several seconds (sometimes more) between these checks. Equally, some very slow queries may not use a lot of CUs and having them time out might cause unnecessary disruption. Overall, though, in my experience setting a timeout will stop enough expensive queries to make doing so worthwhile.

[Update: my colleague Akshai Mirchandani has just reminded me that you can also set the Query Timeout at the workspace level as a Server Property using SQL Server Management Studio, as detailed here. The property is called ServerTimeout. This gives you more flexibility than setting it for the whole capacity.]

Migrating From Power BI P-SKU Premium Capacities To F-SKU Capacities Is Not The Same Thing As Enabling Fabric

Since the announcement in March that Power BI Premium P-SKUs are being retired and that customers will need to migrate to F-SKU capacities intead I have been asked the same question several times:

Why are you forcing me to migrate to Fabric???

This thread on Reddit is a great example. What I want to make clear in this post is the following:

Moving from P-SKU capacities to F-SKU capacities is not the same thing as enabling Fabric in your tenant

No-one is being forced to migrate from Power BI to Fabric and using F-SKU capacities does not mean you are using Fabric. Access to Fabric for your users is governed by the tenant-level settings documented here and these settings work the same way regardless of whether you’re using a P-SKU capacity or an F-SKU capacity. If you do not enable Fabric you can carry on using Power BI in exactly the same way as you did before, with exactly the same functionality, when you move to using an F-SKU capacity. Your users will not have the ability to create Fabric items like notebooks, warehouses, lakehouses and so on just because you’re using an F-SKU.

As the announcement blog post explains, moving to F-SKUs will involve changes about how and where you purchase your capacities and there will be some features that are only available in F-SKU capacities. Migrating workspaces to a new F-SKU capacity is fairly straightforward (and no different from moving a workspace from one P-SKU capacity to another) but if you have questions about how to perform the migration or how this affects how much you’re paying for Power BI you should contact your Microsoft account team.

Query Caching In Power BI Premium

The more you monitor something the more likely it is that you’ll see something strange you can’t explain. This is true of several customers I’ve spoken to recently who saw DAX queries run in the middle of the night by people who were definitely not online at the time and who were worried about the resulting load on their Premium capacities. What is the cause – hackers? Ghosts? In a lot of cases the culprit is actually Power BI’s query caching feature.

Query caching is a feature that is available in Premium and Fabric capacities for Import mode semantic models. It allows Power BI to pre-cache the data needed by some report visuals and dashboard tiles so the reports and dashboards open faster. The documentation here is pretty good but, as always, a few examples are useful to help understand how it works and there are a couple of extra things to point out that are not on the docs page I just linked to.

Let’s start with a simple pbix file called FruitSalesModel.pbix containing a single blank report page and one table in the Import mode semantic model:

The different fruit names will make it easy to link different visuals to different DAX queries. There’s also one measure called Sales Amount that sums the values in the Sales column.

If you publish this pbix to an otherwise empty workspace hosted on a Premium capacity and you’ll see that a semantic model and a blank report are created from it.

Query caching can be enabled on the semantic model by going to the model’s Settings page, expanding the Query Caching section and selecting “On”:

Enabling query caching now means that the Power BI Service may execute DAX queries to populate the cache immediately after a semantic model refresh takes place. If a query has been cached then, when an end user opens a report, Power BI can use the cached resultset for visuals which means that the report will open faster.

You can see how long all these queries take to run by looking at the Refresh History dialog and the details for an individual refresh (as mentioned in this blog post). The Query Cache line tells you when this query activity took place and how long it took:

At this point in our example, however, no queries will be executed because the only report connected to this semantic model is completely blank.

Which queries get cached? This is documented but I always prefer to run my own tests to see what happens; you can see the DAX queries being run after a refresh by running a Profiler trace on the semantic model.

If you edit the currently-blank report in the workspace so that there are two pages, add a single card to the first page that shows the value for the Sales Amount measure for the product Apples and a single card to the second page that shows the value for the Sales Amount measure for the product Grapes, make sure the first page is the page that is displayed when a user opens the report, and then close the report without going back to Reading View, and then refresh the semantic model, then a Profiler trace shows… no queries being executed after the refresh. This is because only visuals that are viewed by an end user in Reading View (not Editing mode) in the browser have their queries cached.

If you then open the report in Reading View and view both pages of the report and then refresh the semantic model again, you’ll see one query appear in the trace:

This is the query for the card on the first page of the report, for Sales Amount for Apples. The query for the card on the second page is not cached – only the queries for the visuals on the page that the report opens on are cached.

It’s also interesting to note that the ApplicationContext column in the trace has a value for the DatasetId but not ReportId or VisualId for these queries. This makes sense because these queries are not executed as a result of a report rendering.

If you then edit the report so it opens on the second page (which shows Sales Amount for Grapes) and not the first page, then refresh the semantic model again, the Profiler trace shows queries for both cards on both pages. In fact, even if you delete the first page from the report and refresh, both queries are still run:

The query for Sales Amount for Grapes is run because it’s now the query on the page of the report that opens first. The query for Sales Amount for Apples is still run because, once a query has been added to the list of queries to be cached, it will stay there until you either delete the semantic model or you turn the Query Caching setting on the semantic model off and on again.

It’s not just the visuals on the opening page of a report that are cached. If you pin any visual, from any page of a report, to a dashboard and then view that dashboard, the query for that visual will be cached. Also, if you create a personal bookmark and make that your default view when you open a report, then the queries for that personal bookmark will also be cached. Finally, if an end user opens a report, changes a filter or slicer on the opening page of a report, then closes the report and reopens it, the report will show the filter or slicer selection that was made when the report was last closed (this behaviour, known as “persistent filters”, is explained here). Queries for each user’s persistent filters on the opening page of a report will also be cached.

The behaviour above applies to all reports connected to a semantic model.

As a result, if your end users like creating dashboards or personal bookmarks, or if you have a lot of filters or slicers on the opening page of your report, or if you have lots of reports connected to a single semantic model, you can end up with a lot of queries being generated after the semantic model has refreshed when query caching is turned on. The good news for Premium capacity admins is that the queries run to populate the cache are classed as background operations and not interactive operations (as is the case for queries generated when a report renders). The CU cost of background operations is smoothed over 24 hours which means that you won’t get big spikes in usage on your capacity when a refresh takes place and the cache is populated, although if a lot of expensive queries are run this is definitely something you still need to monitor. Here’s a screenshot from the Capacity Metrics App showing DAX queries appearing as background operations:

In conclusion, query caching is a very powerful feature. Turning it on can make a big difference to the performance of your reports but you do need to understand how caching works to get the full benefit of it. You also need to make sure you watch out for it when monitoring a Premium capacity to make sure that it doesn’t increase your background utilisation too much.

Why You Should Defragment Your Fact Tables If You’re Using Incremental Refresh In Power BI Premium

A lot of people are building large datasets in Power BI Premium nowadays, and if you’re building a large dataset you’re very likely to be using Power BI’s incremental refresh functionality with your fact tables or manually creating and refreshing partitions in them. If so, you should occasionally do a special kind of refresh to shrink the dictionaries on your fact tables.

To illustrate why you need to do this I created a Power BI dataset with one table containing one column, where each value was a GUID (ie a unique text value). Here’s the M I used to generate it:

let
    Source = {1..10000000},
    Output = #table(type table [MyGUID=text], 
        List.Transform(
            Source, 
            each {Text.NewGuid()})
    )
in
    Output

After publishing the dataset I copied the original partition so there were two partitions in the table, each of which contained ten million rows. When you use incremental refresh in a table Power BI automatically generates and manages partitions for you in the background – and there isn’t much difference between doing this and creating partitions manually like I’m doing here.

Here’s what DAX Studio’s Model Metrics functionality showed when I first refreshed the dataset:

The Dictionary column is the column to focus on in this table. Dictionaries are the data structures created by Power BI’s storage engine, the Vertipaq engine, to hold all the distinct values in a column – the more distinct values there are, the larger the dictionary. As Marco points out here the DMVs that DAX Studio uses to measure dictionary size don’t always return consistent or accurate results but the values it returns are good enough for the purposes of this test.

I then did several Full refreshes on just one of the two partitions. Here’s what DAX Studio showed at this point:

As you can see, the size of the dictionary for the MyGUID column (and therefore the overall size of the dataset) has grown a lot – about 1GB – even though the cardinality of the column has stayed the same. This is because dictionaries are held at the table level, not at the partition level, so by refreshing just one partition I was adding 10 million new text values to the dictionary without deleting any of the old values that has been added to the dictionary by previous refreshes and which were no longer there in the table.

Finally, I ran the following Refresh command against the workspace’s XMLA Endpoint from SQL Server Management Studio:

{
  "refresh": {
    "type": "defragment",
    "objects": [
      {
        "database": "TableWithManyDistinctValues",
        "table": "MyTable"
      }
    ]
  }
}

[Note that XMLA Endpoints are only available if you’re using Premium or PPU]

This refresh is of type defragment and it is documented here. This is the explanation of what it does from the docs:

Defragment the data in the specified table. As data is added to or removed from a table, the dictionaries of each column can become polluted with values that no longer exist in the actual column values. The defragment option will clean up the values in the dictionaries that are no longer used.

After that you can see the dictionary for the table, and therefore the size of the dataset, was back to around its original size:

This is an extreme example of course, but I can imagine something nearly this bad happening in the real world if you use incremental refresh on a table with a text column containing addresses or user comments and if you don’t keep much history. This isn’t new behaviour or functionality: this happens in Analysis Services Tabular models too and is discussed in books like Marco and Alberto’s “Tabular Modelling in Microsoft SQL Server Analysis Services” but not many people are aware it still applies to Power BI datasets. This type of refresh can be quite slow but it is likely to be faster than the only other option (which is the only option if you’re not using Premium) which is to run a full refresh on the dataset.

So why should you run a refresh of type defragment on your dataset? There are two reasons:

If you’re getting close to the size limits imposed by the Premium capacity SKU you’re using, and you want to make sure that you don’t hit them.
Large datasets generally result in slower query performance, so reducing the size of your dataset may result in faster queries.

How often should you do this? In most cases not very often; it depends on how much your dataset shrinks when you do defragment. Once a year may be sufficient and I doubt anyone would need to do it more than once a month. One way to automate this would be using Power Automate to call the Enhanced Refresh API, as I showed in this series of posts.

Understanding The “Unable to save the changes since the new dataset of size” Error In Power BI

If you’re refreshing a Power BI dataset that contains a large amount of data you may encounter the following error:

Data source error Unable to save the changes since the new dataset of size 1120 MB exceeds the limit of 1024 MB. If the dataset is hosted on a PowerBI Premium capacity, you may also reach out to your capacity administrator to see if the limit can be adjusted.

You’ll get this error at the end of a refresh if the size of your dataset is larger than the size allowed by Power BI. The maximum size of a dataset varies depending on whether you are using Shared capacity (commonly known as “Pro”) or Premium capacity:

In Shared/Pro the maximum size of a dataset is 1GB (ie the 1024MB from the error message shown above), as documented here.
In Premium:
- The default maximum size of a dataset is 10GB if the “Large dataset storage format” option (documented here and here) is turned off for your dataset.
- If the “Large dataset storage format” option is turned on, the maximum size of your dataset depends on the amount of memory available in the SKU of the Power BI Premium capacity you’re using. There’s a table in the documentation that tells you how much memory is available for each SKU: look in the “RAM per dataset” column and also note what it says in the footnotes about the maximum size of a dataset being slightly less than the amount of memory listed.

What can you do if you run into this error? You have two choices:

If you’re using a Power BI Pro licence, buy Premium Per User licences or buy Premium capacity and move your dataset into a workspace on PPU or Premium. If you’re already on Premium, upgrade your capacity and make sure that the “Large dataset storage format” option is turned on for your dataset. This will increase the dataset size limits.
Reduce the size of your dataset so it fits within the limits.

Option #2 is likely to be the preferred option in most cases. There are a lot of detailed blog posts and articles out there on how to reduce the size of your dataset (including this one in the docs) but basically you should use DAX Studio’s Model Metrics feature to see what’s taking up the most space in your dataset then delete unnecessary tables and columns, try reduce the number of distinct values in each column, and reduce the number of rows you’re loading into each table (perhaps by reducing the amount of historic data). Quite often a few small changes can significantly shrink the size of your dataset and also help refresh times and query performance.

Cancelling Power BI Dataset Refreshes With The Enhanced Refresh API

The most exciting (at least for me) feature in the new Enhanced Refresh API (blog announcement | docs) is the ability to cancel a dataset refresh that’s currently in progress. Up until now, as this blog post by my colleague Michael Kovalsky shows, this has been quite difficult to do: not only do you need to use the XMLA Endpoint but you also need to take into account that in many cases the Power BI Service will automatically restart the refresh even after you’ve cancelled it. Now, though, if (and only if) you start the refresh using the Enhanced Refresh API you can also cancel it via the Enhanced Refresh API too. This is important because I’ve seen a few cases where rogue refreshes have consumed a lot of CPU on a Premium capacity and caused throttling, even after all CPU smoothing has taken place, and Power BI admins have struggled to cancel the refreshes.

This and all the other great functionality the new API includes (the ability to refresh individual tables or partitions! control over parallelism!) means that it can handle many of the advanced scenarios that, in the past, you’d have had to write some complex TMSL commands for; in my opinion anyone working on an enterprise-level dataset in Power BI Premium should be using it for their refreshes.

But Chris, I hear you say, I’m a data person and find working with APIs confusing and difficult! Yeah, me too – which is why, when I saw this tweet by Stephen Maguire about .NET interactive notebook for Visual Studio Code he’s built for the Enhanced Refresh API, I was interested:

https://github.com/samaguire/PowerBINotebooks

It’s a really great set of examples for learning how to use the Enhanced Refresh API through PowerShell and the notebook format makes it a lot more user-friendly than just another bunch of scripts. I highly recommend that you check it out.

Troubleshooting Power BI Timeouts, Part 3: The External Command Timeout

In the first post in this series I showed how any Power BI dataset refresh started via the Power BI portal or API is limited to 2 hours in Shared capacity and 5 hours in Premium capacity, and how you could work around that by running a refresh via Premium’s XMLA endpoint feature. In the second post in this series I showed how some M functions allow you to set timeouts. However, even if you initiate a refresh via the XMLA endpoint you may still get a timeout error and in this post I’ll discuss another reason why: the External Command Timeout.

This property is a hangover from Analysis Services (you can see it documented here). It represents the amount of time the Analysis Services engine inside Power BI will wait to get data from a data source. How it behaves exactly depends on the data source: it may limit the amount of time it takes to get the first row of data from the source or it may limit the amount of time it takes to get all the rows of data. In Power BI Premium it is set to five hours, which means that no single partition can take more than about five hours to refresh. In the first post in this series I worked around this by creating a dataset with multiple partitions, each of which took about an hour to refresh, but when trying to refresh a dataset with a single partition that takes more than five hours I got the following error when trying to refresh from SQL Server Management Studio through the XMLA Endpoint:

The error message here is:

Timeout expired. The timeout period elapsed prior to completion of the operation.. The exception was raised by the IDbCommand interface.

In this case I saw the same error in the Error event in Profiler:

…and in other cases, when testing a different source, I got a different error in Profiler in the Progress Report Error event:

The message here is:

Error processing partition ‘<pii>SlowCommand-2114bb81-69d3-4fe4-9d54-6b2661b0c497</pii>’ of table ‘<pii>SlowCommand</pii>’ [Internal Error Code: ‘0xc112001a’].

There’s no way to avoid the External Command Timeout. Instead, what you need to do is either change your partitioning strategy so each partition refreshes in under five hours or tune your data source, M code or gateway (if you’re using one) so that data is returned to Power BI faster.

In Shared capacity I believe the External Command Timeout is set to two hours (again, to match the overall refresh timeout) but it’s much less important there because you can’t create partitions manually (the only way a dataset in Shared can be partitioned is by setting up incremental refresh) and there’s no XMLA Endpoint so there’s no way to work around the two hour overall refresh limit anyway.

[Thanks, as always, to Akshai Mirchandani for a lot of the information in this post]

UPDATE August 2022: the ExternalCommandTimeout property can now be set as a workspace-level server property. For more details see https://powerbi.microsoft.com/en-us/blog/analysis-services-server-properties-in-power-bi-premium-are-now-in-public-preview/

Troubleshooting Power BI Timeouts, Part 1: The Two/Five Hour Import Refresh Limit

If you’re working with a large Power BI dataset and/or a slow data source in Import mode it can be very frustrating to run into timeout errors after you have already waited a long time for a refresh to finish. There are a number of different types of timeout that you might run into, and in this series I’ll look at a few of them and discuss some of the ways you can work around them.

In this post I’ll look at one of the most commonly-encountered timeouts: the limit on the maximum length of time an Import mode dataset refresh can take. As documented here these limits are:

Two hours for an import-mode dataset in Shared capacity
Five hours for an import-mode dataset in a Premium or PPU capacity

If you are using Premium you can connect to your dataset using SQL Server Profiler and run a trace when the refresh starts and you’ll see the timeout set as a property of the TMSL command that runs the refresh (18000 seconds = 5 hours):

Here’s an example of the message you will see in the Refresh History dialog if your dataset refresh takes more than five hours in PPU:

Once again you get more details in a Profiler trace. When the refresh times out you’ll see an Error event with long message that starts with the following text:

Timeout expired. The timeout period elapsed prior to completion of the operation.. The exception was raised by the IDbCommand interface. The command has been canceled..

There is a way to work around this limit if you’re using Premium or PPU: it only applies to refreshes (scheduled or manual) that you start from the Power BI portal or via the REST API. If you start your refresh by running a TMSL script via the XMLA Endpoint (for example via SQL Server Management Studio or from PowerShell) then the limit does not apply. This is because you’re executing your own TMSL script rather than having the Power BI Service execute its own refresh command – with a timeout set – like the one shown in the screenshot above.

For example, here’s the same dataset from the screenshot above but refreshed successfully from SQL Server Management Studio (notice the Type column says “Via XMLA Endpoint”) and with a duration of just over five hours:

There are a couple of blog posts out there showing how you can implement a refresh strategy using the XMLA Endpoint; this post from Marc Lelijveld and Paulien van Eijk is a great example of how to do this using Azure Automation and Azure Data Factory.

You should also consider tuning any refresh that takes a long time and it could be that after tuning it you fall well within the two/five hour limit. There are a lot of things to consider when tuning dataset refresh; I did a conference session on this topic last year (you can watch the recording here) covering issues like data modelling, query folding in Power Query and the use of dataflows but it doesn’t cover everything and I have learned a lot even since then. In my opinion one of the most important things you can do to improve refresh performance for very large Import datasets is to partition your fact tables and increase the amount of parallelism, as I describe here. One customer I worked with was able to reduce their refresh time from four hours down to one hour using this technique:

As I said, though, there are many different types of timeout that you may encounter – so even if you refresh via the XMLA Endpoint it may still time out for another reason.

Visualising Power BI Premium And Azure Analysis Services Query Parallelism

In my last post I showed how to connect SQL Server Profiler up to a Power BI Premium dataset but I didn’t give you any examples of why this might be useful. In this post I’ll show you how you can use a Profiler trace to visualise all the queries run by a Power BI report, see when they start to run, see which ones run in parallel with each other and see what the overall time taken to run all the queries is.

Why is this important? When you’re tuning the performance of a Power BI report the first thing to do is to look at the performance of the individual DAX queries run and make them run as fast as possible. However when a Power BI report is rendered any one query is likely to be run at the same time as several other queries run for the same report, and this will have an impact on its performance. How much of an impact there is will depend on how many queries need to be run and the number of back-end v-cores available on your Premium capacity, or the number of QPUs available on your Azure Analysis Services instance if you’re using a Live connection to AAS. The more v-cores/QPUs you have available, the more of the work needed for a query that can be run in parallel; you can see a table listing the number of v-cores for each Premium SKU here, and the number of QPUs for each Azure Analysis Services SKU here. As a result of this if you have reports with a large number of visuals that generate slow DAX queries, scaling up your Power BI Premium capacity or AAS instance may improve overall report performance. Reducing the number of visuals on your report and/or reducing the number of visuals needed to display the same information will also reduce the number of queries that need to be run and therefore improve overall performance.

As I showed last week, SQL Server Profiler can be used to create a trace that logs all the queries run against a Power BI Premium dataset in the same way as it can be used with Azure Analysis Services. Assuming that you have a trace running that uses only the Query End event, this will give you a list of all the queries that are being run along with their start time, end time, duration and a lot of other interesting information. A table with all this data in can still be difficult to interpret though, so I built a Power BI template for a report that visualises all these queries and helps you understand the amount of parallelism that is taking place. You can download the template file here.

To use it, first you need a trace file. Make sure that no-one else is running reports on the Premium capacity you want to test (creating a Power BI Embedded capacity for testing purposes is a good idea) and then, when the trace is running, refresh your report using the technique I described in the “Use the network tab” section of this blog post. This will also allow you to correlate what you see in the trace with the information you see in the DevTools tab in the browser.

Then save the trace file you can created to XML by going to File/Save As/Trace XML File:

SaveToXML

Next, open the Power BI template file and when prompted, enter the full path of the trace XML file you just created:

TemplateOpening

A new Power BI report will then be created. If you want to point the report to a different trace XML file all you need to do is change the value of the TraceXMLFile Power Query parameter.

On the first page you’ll see the name of the trace XML file you connected to plus a bar chart showing each Query End event (with each query identified by a number) on the y axis and the duration of each query on the x axis:

Waterfall

It’s not quite a simple bar chart though. What I’ve done is:

Found the start time of the first query run
Calculated the start time of every other query in the file relative to this first start time (although, unfortunately, Profiler only gives you start times rounded to the nearest second which means you can’t know exactly when a query starts)
Created a stacked bar chart where the first value in the stack is this relative start time and the second value is the duration of the query in seconds
Made the colour of the relative start time transparent, so you only see the blue sections of the bar for the query durations. This gives you a waterfall-like effect and allows you to see which queries are run in parallel. This also makes it easy to see the total amount of time taken to run your queries, from the start of the first query to the end of the last query, which is just as useful to know as the duration of any single query.
There’s also a drillthrough page so you can right-click on a bar and see a table with the DAX query for the query you clicked on, as well as its start time and duration.

It’s a very basic report, I know, and I would be interested to know if you have any ideas about other ways of visualising this data. What’s more, a visual like this raises more questions than I know how to answer… yet. For example, one thing I want to investigate is the effect that query interleaving has on this graph and both perceived and actual report performance. So stay tuned for more blog posts on this subject!

Connecting SQL Server Profiler To Power BI Premium

Back in December when I wrote a series of posts on testing the performance of Power BI reports in the browser, I mentioned that it was important to test in the browser because some aspects of the performance of a report may be different there compared to in Power BI Desktop. Following on from this, if you’re testing performance of a report in the browser you are also going to want to take a closer look at the DAX queries generated by your report, even if it is just to check that what you see there is the same as what you see in Performance Analyzer in Power BI Desktop. If your report uses a Live Connection to Analysis Services this is easy to do using either SQL Server Profiler, Azure Analysis Services’s diagnostic logging feature or XEvents. If you’re using a dataset stored in Power BI we have a range of options for monitoring what’s going on including Usage Metrics and the Premium Capacity Metrics Apps and of course there’s also DAX Studio, but for an old-school guy like me, connecting to a Power BI Premium workspace using SQL Server Profiler is a great way to go to get detailed information about what’s going on when queries run. In this blog post I’ll show you how to connect Profiler to Power BI Premium.

First you’ll need to install the latest version of SQL Server Management Studio, which includes SQL Server Profiler – you can get it here. If you have an older version installed you’ll probably need to upgrade. And before anyone leaves a comment about Profiler being deprecated, let me point you to the note on this page:

Note

The feature that allows you to connect Profiler to a Power BI Premium workspace is XMLA Endpoints: it’s in preview right now but basically this allows you to connect any tool that works with Azure Analysis Services up to Power BI Premium. SQL Server Profiler wants to connect to an instance of Analysis Services; XMLA Endpoints mean that you can connect it to a dataset in a Power BI Premium workspace but for this to happen you need to know the url for Profiler to connect to. You can find this by going to your workspace, clicking on Settings:

Workspace Settings

and then going to the Premium tab and copying the Workspace Connection string:

Workspace Connection

You can then open up Profiler, go to the File menu and select New Trace and a connection dialog will appear:

ProfilerConnection1

In this dialog:

Set the Server type to Analysis Services
In Server name paste the Workspace Connection that you copied from Power BI earlier
In Authentication select Azure Active Directory – Universal with MFA and enter your username

Next click the Options button and go to the Connection Properties tab and on the Connect to database dropdown select <Browse server…>:

ProfilerConnection2

Click Yes on the dialog that appears and then choose the name of the dataset in your workspace that you want to connect to in the Connect to database dropdown. If you don’t do this you’ll get errors later on.

Next you’ll see the Properties dialog:

ProfilerProperties1

The default template is Blank, which means no events are selected to monitor; to select events go to the Events Selection tab:

ProfilerProperties2

…select the events you want, and click Run to start tracing.

Trace

Which events should you choose? That’s a big topic and not one that I have time to go into here, but the Query End event is perhaps the one I look at most – it’s fired every time a query finishes executing and gives you a lot of important information such as the start time, end time and duration of the query in milliseconds. Books such as “The Definitive Guide To DAX” have a lot of information on using Profiler with Analysis Services and a lot of that information is relevant to Power BI Premium too. In future blog posts I dare say I’ll show you some interesting things you can do using Profiler and Power BI too…

Category: Premium

Limit The Impact Of Expensive Power BI Queries On Your Capacity By Reducing The Query Timeout

Like this:

Migrating From Power BI P-SKU Premium Capacities To F-SKU Capacities Is Not The Same Thing As Enabling Fabric

Like this:

Query Caching In Power BI Premium

Like this:

Why You Should Defragment Your Fact Tables If You’re Using Incremental Refresh In Power BI Premium

Like this:

Understanding The “Unable to save the changes since the new dataset of size” Error In Power BI

Like this:

Cancelling Power BI Dataset Refreshes With The Enhanced Refresh API

Like this:

Troubleshooting Power BI Timeouts, Part 3: The External Command Timeout

Like this:

Troubleshooting Power BI Timeouts, Part 1: The Two/Five Hour Import Refresh Limit

Like this:

Visualising Power BI Premium And Azure Analysis Services Query Parallelism

Like this:

Connecting SQL Server Profiler To Power BI Premium

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this: