The team at SQLBits have been publishing all the session recordings from their last (online) conference on their YouTube channel. There’s a lot of great content there to check out and this post is to highlight one of my sessions, “Performance tuning Power BI dataset refresh”.
In this session I look at all of the factors that can influence how long it takes to import data into Power BI and what you can do to make it faster. Topics covered include:
Choosing a dataset storage mode
The importance of good data modelling
How the type of data source you use effects how quickly data can load
Ways to measure refresh performance, such as using SQL Server Profiler and Power Query Query Diagnostics
Power Query options that can influence refresh times such as disabling data previews
Query folding in the Power Query engine
Vertipaq engine features that affect refresh, such as calculated columns and calculated tables
When you’re faced with a slow Power BI report it’s very easy to assume that the problem is something to do with the dataset, how it’s modelled and how the DAX has been written, and then disappear down a rabbit hole of Marco-and-Alberto videos trying to shave milliseconds of the time taken by each DAX query the report runs. It’s certainly an assumption that I make all the time, and indeed most performance problems can be fixed in this way. However this can mean that you miss other important ways of improving report performance.
Two things happened last week to remind me of this. First of all I spent some time with my colleague Miguel Myers (check out his YouTube channel) who showed me how he had tuned a customer’s report and reduced one page’s time from 27 seconds to 10 seconds without making any changes to the dataset or DAX while still showing the same information on the page. He did this partly by using a smaller number of visuals to display the same values – something I blogged about here when I learned this from him earlier this year – but mostly by reducing the number of visuals like shapes and textboxes that don’t display any data at all. Secondly, I was helping some other colleagues with load testing a Power BI report using this tool, and trying to understand whether there was a performance bottleneck inside Power BI or on the client machines running the load test; it turned out that it was the latter problem.
You may have read the title of this post and guessed that I’m going to talk about reducing the number of visuals that display data from your dataset as a way of improving performance, but that’s not the case. In this blog post I want to show how visuals that do not display any data from your dataset can have a significant impact on report performance. Before we carry on I suggest you read the series of posts I wrote late last year on measuring the performance of reports in the browser using Chrome/Edge DevTools (part 1 is here, part 2 is here, part 3 is here) because I’ll be using techniques described in these posts in my testing.
I have a Power BI dataset with a single table in it and that table contains one column and one row:
Here’s a report page with a single card on it that displays that single value:
As you might think, this page is very quick to render. Using a secure embed link and the Lighthouse tab in DevTools, the Time To Interactive reading on a cold cache is 3.7 seconds (which, remember, is a worst-case test and slower than most users will experience).
Now, consider a second report page in the same pbix file with the same card visual on it, but this time with 260 rectangle visuals added:
These rectangle visuals don’t display any data. The Time To Interactive reading for this page is now a whopping 24.3 seconds! What’s more, Power BI Desktop is noticeably more sluggish when editing this page.
OK, this is an extreme example but it does go to show how adding too many visuals to your report pages, even when those visuals do not display any data, can have a noticeable impact on report performance. What if you need to use lots of shape visuals to provide visual grouping in your report though? The easy solution is not to use lots and lots of shapes but to create an image with the outlines you want and set that as the page background. It can be made to look exactly the same but in this case, using an image instead of all those shapes results in a page that has a Time To Interactive reading of 4.5 seconds – only a bit slower than the original page. Most users won’t be able to spot the difference either:
Nothing of what I have said so far is particularly new or ground-breaking, and indeed another one of my colleagues, Chris Hamill, wrote about this last year in a great post on how he uses PowerPoint to design backgrounds for Power BI reports. There is one last question that needs to be considered though, and it’s the question I was confronted with this week during the load testing: how do you know if you have this problem without completely redesigning your reports? Even if you don’t think you have a particularly cluttered report, is there still an opportunity to shave a second off your report load time by removing visuals? The information you see on the Network tab in DevTools can give you a hint.
Here’s what the Network tab (see this post for how to use it) in DevTools shows for the first report page above with only the card visual on it:
Focusing on the events in the 1.5 second period after the red line shown in the timeline at the top (which is the load event for the page) and before the first of the querydata requests (which are fired when the visuals request data from the dataset, one of the last things that happen in a report render), you’ll notice there are no significant gaps.
Compare this with the same part of the Network waterfall, a period of almost 14 seconds, for the version of the page with the 260 rectangles on: