In my last post I showed how to connect SQL Server Profiler up to a Power BI Premium dataset but I didn’t give you any examples of why this might be useful. In this post I’ll show you how you can use a Profiler trace to visualise all the queries run by a Power BI report, see when they start to run, see which ones run in parallel with each other and see what the overall time taken to run all the queries is.
Why is this important? When you’re tuning the performance of a Power BI report the first thing to do is to look at the performance of the individual DAX queries run and make them run as fast as possible. However when a Power BI report is rendered any one query is likely to be run at the same time as several other queries run for the same report, and this will have an impact on its performance. How much of an impact there is will depend on how many queries need to be run and the number of back-end v-cores available on your Premium capacity, or the number of QPUs available on your Azure Analysis Services instance if you’re using a Live connection to AAS. The more v-cores/QPUs you have available, the more of the work needed for a query that can be run in parallel; you can see a table listing the number of v-cores for each Premium SKU here, and the number of QPUs for each Azure Analysis Services SKU here. As a result of this if you have reports with a large number of visuals that generate slow DAX queries, scaling up your Power BI Premium capacity or AAS instance may improve overall report performance. Reducing the number of visuals on your report and/or reducing the number of visuals needed to display the same information will also reduce the number of queries that need to be run and therefore improve overall performance.
As I showed last week, SQL Server Profiler can be used to create a trace that logs all the queries run against a Power BI Premium dataset in the same way as it can be used with Azure Analysis Services. Assuming that you have a trace running that uses only the Query End event, this will give you a list of all the queries that are being run along with their start time, end time, duration and a lot of other interesting information. A table with all this data in can still be difficult to interpret though, so I built a Power BI template for a report that visualises all these queries and helps you understand the amount of parallelism that is taking place. You can download the template file here.
To use it, first you need a trace file. Make sure that no-one else is running reports on the Premium capacity you want to test (creating a Power BI Embedded capacity for testing purposes is a good idea) and then, when the trace is running, refresh your report using the technique I described in the “Use the network tab” section of this blog post. This will also allow you to correlate what you see in the trace with the information you see in the DevTools tab in the browser.
Then save the trace file you can created to XML by going to File/Save As/Trace XML File:
Next, open the Power BI template file and when prompted, enter the full path of the trace XML file you just created:
A new Power BI report will then be created. If you want to point the report to a different trace XML file all you need to do is change the value of the TraceXMLFile Power Query parameter.
On the first page you’ll see the name of the trace XML file you connected to plus a bar chart showing each Query End event (with each query identified by a number) on the y axis and the duration of each query on the x axis:
It’s not quite a simple bar chart though. What I’ve done is:
- Found the start time of the first query run
- Calculated the start time of every other query in the file relative to this first start time (although, unfortunately, Profiler only gives you start times rounded to the nearest second which means you can’t know exactly when a query starts)
- Created a stacked bar chart where the first value in the stack is this relative start time and the second value is the duration of the query in seconds
- Made the colour of the relative start time transparent, so you only see the blue sections of the bar for the query durations. This gives you a waterfall-like effect and allows you to see which queries are run in parallel. This also makes it easy to see the total amount of time taken to run your queries, from the start of the first query to the end of the last query, which is just as useful to know as the duration of any single query.
- There’s also a drillthrough page so you can right-click on a bar and see a table with the DAX query for the query you clicked on, as well as its start time and duration.
It’s a very basic report, I know, and I would be interested to know if you have any ideas about other ways of visualising this data. What’s more, a visual like this raises more questions than I know how to answer… yet. For example, one thing I want to investigate is the effect that query interleaving has on this graph and both perceived and actual report performance. So stay tuned for more blog posts on this subject!