When the new “Refresh a dataset” action for Power Automate (formerly Flow) was released last year I couldn’t help thinking there must be something really cool you could do with it – I just didn’t know what. There are lots of great resources explaining the basics (see Jon Levesque’s video here for example) and Adam Saxton did a nice demo here using the “When an item is modified” trigger to show how to refresh a dataset when a value in a SQL Server table is updated, but that’s it. So I got thinking about the types of problem it could solve and the fun I could have with it…
While Power BI’s scheduled refresh functionality is great, it doesn’t give you as much flexibility as you might need. For example:
You might want to schedule your dataset to refresh only on weekdays, not weekends, and you might also want to cancel refresh on certain days like public holidays. You might also only want to refresh a dataset on a monthly basis or at less than the half-hourly granularity that the UI allows. Why? Perhaps because it’s important to minimise the load you put on your source systems; it’s also the case that for many cloud data sources the more data you read, the more you pay.
If you have a lot of datasets to refresh you might want to control which datasets are refreshing in parallel, again to reduce the load on your data sources and if you’re using Premium, to reduce the load on your capacity. It’s hard to get an overview of when all your refreshes are scheduled in the Power BI Portal and manage what’s happening when.
The ideal way to view when multiple events are scheduled is a calendar and we’ve got great calendar functionality in Outlook. What if you could schedule refresh of your datasets from a calendar in Outlook? It turns out to be easier than you might think! Here’s how.
The first thing I did was create a new calendar in Outlook called Power BI Refreshes:
In this calendar I created appointments (either recurring or one-off) for every dataset refresh:
For each appointment, I entered the unique identifier of the dataset in the Title and the unique identifier of the workspace in the Location like so:
You can find these unique identifiers by going to the Settings screen for your dataset in the Power BI Portal and looking at the url:
Cathrine Wilhelmsen has more details on finding these ids here.
Last of all, I created a very simple Flow in Power Automate:
The “When an upcoming event is starting soon” trigger is fired when each of the appointments on the Power BI Refreshes calendar is about to start. It then passes the Location and Subject from the event – which of course contain the ids the workspace and dataset to be refreshed – to the Refresh a dataset action, which does the refresh.
This isn’t something I recommend putting into production but I think it’s very interesting as a proof-of-concept. I guess Logic Apps would be a more robust alternative than Power Automate and I would want to be 100% sure that events fired when I was expecting them to fire, so some thorough testing would be needed. I’m not experienced enough with Power Automate/Logic Apps to know if I’m doing the right thing, to be honest. I also feel like using ids in the meeting title and location is a bit hacky and there must be a nicer way of handling this.
On the Power BI side, it’s worth remembering that when a refresh is kicked off in Power BI the actual refreshing only starts when the Power BI Service has the required resources available, and especially in Shared capacity this can involve a wait of several minutes. What’s more the “Refresh a dataset” action does not know whether the refresh it kicks off succeeds or fails; I guess if you wanted to handle retries or notifications on failure then you would need to call the Power BI API get the refresh history of a dataset – there’s no built in action to do it, but it’s possible with a Power Automate custom connector.
If you have any thoughts about this, ideas on how to make this better or if you do put something like this into production, let me know – I would love to hear from you!
Following on from the series of posts I wrote last month (starting here) about testing the performance of Power BI reports in the browser, I would now like to highlight one possible cause of slow report rendering performance: the over-use of custom visuals. Before we go any further, though, let me be absolutely clear about something:
I am NOT saying that custom visuals are bad and that you should never use them.
However if you are using an older browser to view reports, have a slow PC and/or have a slow network connection, then having a large number of custom visuals on the same page can be bad for performance. In fact having a large number of any kind of visual on the same report page can be bad for performance, as Mike Carlo shows here. Even so there is an extra performance overhead when you use a custom visual instead of one of the built-in visuals.
Take for example, the following report. The dataset it uses has one table with one column and one row containing the value 1, so retrieving data from it is going to be super-fast. This page has 26 Advance Card custom visuals on it displaying the value 1 from the dataset:
Using Chrome DevTools and the functionality on the Network tab to simulate a cold cache and a fast 3G network (as I show here), this page takes 45 seconds to render. Another page on the same report has 26 instances of the built-in card visual to show the same data:
Under the same simulated network conditions, Chrome DevTools shows this page takes 25 seconds to render. So the version of the page with the custom visuals is 20 seconds slower to render than the version with the built-in visuals, although this is obviously an extreme example. Without the network throttling (but with the cold cache) the version with the Advance Card visuals loads in 8.5 seconds and the version with the built-in cards takes 5.5 seconds; still a noticeable difference but maybe not one that your users would complain about.
Let me be clear about something else here:
I am NOT saying that the Advance Card visual is badly implemented – it’s a great visual and one that is used by a lot of people without any problems. What I am saying is that any custom visual comes with some kind of performance overhead and the more custom visuals you use on a page, the more the overhead will be.
How much overhead there is will depend on a lot of different factors so you should test to see if there is any noticeable impact from using custom visuals in your own reports – it’s very likely that there won’t be one. What’s more, there are improvements being made to the Power BI Service all the time and it’s highly probable that some optimisations for the loading of custom visuals will be made in the future.
One last tip for you. If you have imported a custom visual in Power BI Desktop but are not using it, you should always delete it (as shown here). There is another, smaller overhead associated with even having a custom visual imported into a report – although, again, this may well change in the future.
Every year, on the anniversary of the first-ever post on this blog, I write a post summarising my thoughts on what’s happening in the world of Microsoft BI and what I’m up to professionally.
This year has seen bigger changes than most: in June I closed down my company, gave up being self-employed after thirteen years and took a job at Microsoft. I’m pleased to say that I don’t regret this decision at all and I’m really enjoying my new job. The work is more interesting (as a Power BI fanboy, what could be better than working for the Power BI team?), my colleagues are great, I’m travelling a lot less and as a result I’m feeling a lot more healthy and relaxed and I’m able to spend more time with my family. I am earning a bit less but overall there’s no question that going to Microsoft has been a change for the better. I’m not surprised that so many MVPs are doing the same thing these days: Microsoft is a great place to work right now.
The thing is that even after explaining my reasons I still get asked by people why I moved, as if I’m hiding something, because it seems the opposite of what most people hope to do – it’s a bit like the famously boring British Prime Minister of the 1990s, John Major, who as a child ran away from the circus to become an accountant. I guess there are a lot of people who dream about leaving corporate culture behind to be a freelancer. Now don’t get me wrong, I really enjoyed my previous life, but although freelancing has its benefits it’s not all great and there are many positives about working for a big company as well; I’m particularly happy that I don’t have to chase unpaid invoices any more, for example. One of these days I should really get round to writing about my experiences and how it’s possible to make a living as an independent Power BI consultant and trainer, to help anyone who’s interested in making the the jump to self-employment…
One last thing to reflect on is what I’ve learned in my first six months at Microsoft. I don’t think I know Power BI in any more depth than before I joined – it turns out there is no secret store of inside information that you get access to when you join the team – and all the PMs and devs that I used to bother with my questions before answer my questions with the same level of detail now. It just goes to show how well we treat our MVPs and customers! I do think I have a much broader understanding of how Power BI works and its architecture, though, and I now have a longer list of PMs and devs who I can ask questions. I also have a much better appreciation for how difficult some features are to implement and how tricky it is to prioritise features; next time you get frustrated at the lack of a particular feature in Power BI you can be reassured by the thought that it’s almost certain that someone on the team already knows about the issue and is working on it. Blogging is easier in some ways and more difficult in others: as I said, I now have more access to interesting information but not all of it is bloggable or indeed useful to anyone outside Microsoft, and there are things that I would have blogged about in the past that I won’t write about now (this is a good example) because they aren’t really intended for use by customers or practices that should be encouraged. Finally, getting to work on some of the biggest, most complex Power BI implementations in the world is an eye-opener. Sure, I used to get to work with customers who were doing interesting things before but the customers I work with now are at a different level, and from a technical point of view it’s really exciting.
So yes, life is good and I’m looking forward to 2020 and all the great things that planned for Power BI next year. Thanks for reading!
One new feature that was introduced in the December 2019 release of Power BI Desktop, but which wasn’t mentioned in the release blog post, is the “Diagnose Step” button on the Tools tab of the Power Query Editor ribbon:
As you might have guessed, it’s closely related to the Query Diagnostics functionality that was introduced back in October. Whereas the existing Query Diagnostics functionality allows you to see what happens inside the Power Query when a query is executed, this new feature does something similar but allows you to run a query up to a specific step. This is useful in scenarios where you want to reduce the diagnostics data you are collecting to a subset of the steps in the query without having to comment out a lot of M code.
To illustrate this, consider the following super-simple query with three steps:
If you click the Start Diagnostics button on the Tools tab of the ribbon of the Power Query Editor, refresh the query from the main Power BI window (as I show here), then click the Stop Diagnostics button, then the query will be run, the result of Step3 will be loaded into the dataset and the diagnostic data collected will be for the whole query.
However, if you select Step2 in the Applied Steps pane:
…and then click the new Diagnose Step button on the Tools tab of the ribbon, the query will only be executed up to Step2 (meaning any steps that Step2 is dependent on will also be evaluated) and the diagnostics data collected will reflect that.
In part 1 of this series I talked about why you might want to test the performance of your Power BI report in the browser; in part 2 I showed how you could test report performance in the browser; and in this blog post I’ll walk through an example of a report that is slow and show how to find out what the problem is.
Let’s say you are building reports for a chain of fast food restaurants and you have just created a new one showing sales of your products. It consists of one page with one rather smart-looking column chart visual on it:
…so you go ahead and publish. You view the report after publishing and it still seems fast. Then the complaints start coming in: the report is slow!?! It seems to be users who are viewing the report on their phone who are having the most problems. So, following the instructions in my last post, you open up Chrome DevTools and run an audit using a simulated slow 4G connection:
You get the following results:
They look pretty awful – 30.5 seconds for the report to render! Scrolling down on the audit results you also see the likely cause:
A large network payload, with the number one culprit a large jpg? So next you go to the Network tab and do a hard refresh of the report, disabling the cache and simulating a fast 3G connection:
Here’s what the report render looks like:
As you can see, it not only takes over 30 seconds to render the report, but worst of all the column chart is only visible right at the end. The waterfall shows something is being downloaded that is 1.2MB in size and that this takes 7.23 seconds:
Hovering over this request displays a tooltip that gives the full filename which, again, is a jpg file. The only image on the report is the background image used in the column chart and it turns out it is 1.2MB in size:
So it’s the background image used in the column chart that is the main problem! You remove the background image from the visual:
…and as expected, when you re-run the audit using the same settings as before the report renders seven seconds faster, taking 23.4 seconds:
Still not great, I know, but remember this is worst-case performance: not only does the audit recreate slow network and hardware, but it also recreates a cold cache – something your users will rarely encounter.
Hopefully this fairly simple example shows how useful the tools and techniques shown in the first two parts of this series are for troubleshooting certain types of report performance problem. It also goes to show how important it is to make sure any images you use in your report are as small as possible – something that, in my experience, many Power BI developers don’t always do.
In the first part of this series I talked about why you might want to test the performance of a Power BI report in the browser, as opposed to using Performance Analyzer in Power BI Desktop, and the different factors you have to take into account when doing this kind of testing. In this post I’ll describe how you can find out how long it takes for your report to run in a browser.
You can think of Power BI as a low code/no code tool for creating web applications for reporting and analytics, and as such you can use the same tools that web developers use for measuring web site performance to measure Power BI report render time. Now I am not a web developer (I guess most of you aren’t either) so I’ve had a lot to learn about this subject, and I’m all too aware of how much more there is still to learn, but I wanted to write up what I have learned so far because I think it’s so useful. Please excuse any mistakes, misconceptions or over-simplifications!
I’m going to use Google Chrome as my browser in this post. Yes, I know, I work for Microsoft but this is the new Microsoft and I have used Chrome as my browser of choice for the last few years; anyway, Edge is moving to be Chromium-based so everything I show in this post works in exactly the same way in the current Edge beta I have installed on my PC. As I said last time, it’s also important to test performance in whatever browser your end users use to understand what their experience is, but any optimisations you make in your report that work for Chrome or Edge beta are likely to work for other browsers too.
Here’s the methodology that I have come up with for measuring report performance in the browser:
Choose the page you want to test
You can only test the performance of one page in a report at a time, so choose the page that is either the most used, used by the most important users, or takes the longest time to render.
Add a blank page to your report
Before you publish your report, in Power BI Desktop add a blank page with no visuals on to it. It doesn’t need to be the page that is opened when the report opens and you will be able to delete it later. Why do this? When you’re testing how long it takes for your report page to render, you’re probably doing so because you want to improve performance. Some things in the report page that influence performance you have the power to change, such as the design of the dataset, the DAX in the measures, the number and type of visuals on a page; some things will always happen when a report runs and you have to accept that overhead. Testing how long a blank page takes to render will give you an idea of how long this latter category of “things that always happen” takes, and you can subtract this time from the time your chosen report page takes to run.
Get the embed url for your report page and the blank page
After you have published your report, get the secure embedding urls for your chosen report page and the blank page. You can find out how to do this here. Getting the embed url for the report takes a few clicks, but you will need to do a bit of copy-and-pasting to get the embed urls to open the report on the blank page and your chosen report page.
Why do this? Once again, it’s all down to focusing on what you have control over. When a user views a report they normally do so in the Power BI Service portal and the time this takes consists of the time taken to render the portal plus the time taken to render the report. Since you don’t have any control over how long the portal takes to render you might as well ignore this in your testing, so you can get a clearer picture of the amount of time taken to run the report. When you paste the embed url for the report into a browser tab the portal will not be displayed and you’ll only see the report.
Create a new user profile in Chrome
This is something I do anyway for each customer I work with, because it makes handling multiple Power BI logins much easier, but for performance testing you should create and use a new user profile in Chrome. This allows you to delete any Chrome extensions just for this profile (apparently they can affect page render times) and delete the cache if you want, without affecting your main profile. You may also need to test from an incognito window if there are some extensions you can’t delete.
Run your report pages to warm the SSAS/Power BI cache
Copy the embed urls for the blank page and the report page into new tabs and let them load before you do anything else, to warm the SSAS/Power BI cache. I discussed testing performance on a cold or warm SSAS/Power BI cache in my last post, and as I said then, while I believe it is extremely important to understand the impact of SSAS/Power BI caching on report rendering performance, it’s only really feasible to do the kind of DAX performance testing that requires this in Power BI Desktop. For testing report page rendering performance in the browser it’s easier to assume that you are running on a warm SSAS/Power BI cache.
Create an audit
The easiest way to measure report load time is to open Chrome DevTools (you can do this by clicking Control+Shift+j in Chrome) while on the browser tab with the report page you want to test and use the functionality on the Audits tab. To get an understanding of what the Audits tab does I recommend you read this page and watch the accompanying video, paying particular attention to the steps in the section on how to establish a baseline. To keep things simple I suggest you choose the Desktop option in the Device section (choosing Mobile does not seem to render the Phone layout of your report if you have created one); check the Performance box in the Audits section; and then choose to apply one of the throttling options if you want:
When you click the Run audits button various tests will be run, and after a short wait you’ll see the results:
You can read more about what these results mean here but there is one number that a Power BI developer should be particularly interested in: Time to Interactive. As far as I can see this is pretty much the amount of time it takes for the report page to render (because, of course, defining what “render” means is complicated). Subtract the amount of time it takes to render the blank page from the amount of time it takes to render your chosen report page and you will know how much of the report page render time you can actually influence with any changes you make. I strongly recommend you run tests both with and without network throttling applied, because applying throttling really highlights any inefficiencies in your report design.
Use the Network tab
If you want to peak behind the scenes and see more detail about what happens when your report page renders, as well as have more control over things like network throttling, you can use the Network tab in Chrome DevTools. You can find out how to use it here (with more details here) but all you really need to do is:
Open up a new tab with the embed url of the report page you want to test
Open Chrome DevTools and go to the Network tab
Make sure the Disable Cache box is checked and set up any network throttling options you want to use for your tests. The ability to add custom network configurations is particularly useful:
Press and hold the refresh button on the browser toolbar and a dropdown menu will appear; click on Empty Cache and Hard Reload to refresh the page and simulate a user’s first visit to the report. This will automatically start data capture in the Network tab.
When your report page has finished rendering, click on the red stop button in the top left hand corner of the Network tab. You need to be quick but not that quick, for reasons I’ll explain in the next bullet.
Down at the bottom of the Network tab you’ll see some useful metrics, including the number of requests made, the amount of data transferred and most importantly, the finish time. This last metric is not the time elapsed from the start of the refresh until the point you clicked the stop button; it’s the time elapsed from the start of the refresh until the last request before you clicked stop, so this is another good way of finding the render time of your report. You’ll see that requests do get made after the report has finished rendering which is why you need to be fairly quick clicking the stop button, but this is still a lot more accurate than using a stopwatch to time report page rendering.
The majority of the pane will be taken up with a list of requests made and a waterfall chart that visualises how long each request took.
Don’t get too clever!
You’ll see a lot of really interesting information on both the Audits tab and the Network tab and I’m sure most of you will be able to interpret what you see there and use it to tune your reports. I hesitated about mentioning the waterfall chart of the Network tab in this post but decided to do because I have found things on there that have helped me while performance tuning that I would not have found any other way. However, remember everything that you see on the waterfall chart in the Network tab is internal, undocumented and subject to change at any time. As a result please, please, please do not send me messages like “I’ve seen x happening on the Network tab, can you tell me what it means?” or “I’m a web developer and I can’t believe the stupid stuff I can see Microsoft doing here, why are you doing this?” because I’m not able to provide any answers.
As always, if you have any feedback or suggestions please leave a comment. In the next part of this series I’ll show you a worked example of how you can use all this to solve a performance problem that does not manifest in Power BI Desktop.
If your Power BI report is slow, most of the time the reason is something to do with the way you have modelled your data or the way you have written the DAX in your measures. There are plenty of resources out there to help you troubleshoot this kind of problem (Marco’s video “My Power BI report is slow: what should I do?” is a great place to start) but it’s important to understand that there are several other possible reasons why a Power BI report is slow, and these types of performance problem may only manifest themselves after the report has been published when a user is viewing it in the browser. As a result, it’s important to be able to test how long a report takes to render in a browser. If you can’t do this, how can you know if any changes you make help or hurt performance?
It turns out that testing performance of a report in the browser is not as straightforward as it seems. In this post I’m going to describe some of the factors you have to take into account when doing this type of testing; in the next post I’ll go into more detail about how you actually measure report rendering times in the browser and how to see what happens when the report is rendered.
Power BI is always changing
The first thing to remember is that the Power BI Service is always being updated and improved, which means that even if everything else in your environment stays the same the performance of any given report may vary (and hopefully get better) from day to day. Most changes to the Power BI Service are not publicised and there’s no way of knowing how much impact they will have on your report. You need to take this into account when you do testing: don’t create a benchmark and then wait a week to see if any changes you have made have improved performance. Instead create a new benchmark every day you are testing and compare the changes you make to it.
Use the browser your users are using
Report performance can vary a lot depending on the browser you’re using. Before you ask, I don’t have any recommendations for the best or fastest browser to use for Power BI but generally speaking the more ‘modern’ your browser the better; this tweet from my colleague Chris Hamill showing how much faster the new Chromium-based version of Edge is compared to regular Edge is really striking. If you’re going to test performance of a report, though, you need to use the browser(s) your users are using, especially if that browser is IE.
The spec of the PC or mobile device running the browser is also a major factor in performance. As a Power BI developer it’s likely your hardware will be far superior to whatever device your users have to view your report with. It may be a good idea to borrow a PC, laptop, phone or tablet that is equivalent to what your users use, just for testing purposes.
Users in different parts of the world with different internet connections will experience different report performance, and network conditions and therefore report performance may also vary by time of day. Power BI has some basic functionality to show the network latency that users are experiencing but it’s better to get someone who really knows what they are doing to help you understand what’s happening at the network level (I am absolutely not an expert in this). To a certain extent you can reproduce a different network conditions in the browser for testing purposes, but ideally you should to try to reproduce any problems and do any testing wherever the users are based and at the times of day they are using Power BI.
Power BI query caching
Datasets in Power BI Premium have an option to cache queries and improve performance for the initial page of a report. If you have this turned on, it means that the initial report load might be faster than subsequent interactions such as changing a filter or a slicer, so you should not just test the initial render time.
The Power BI/Analysis Services cache
Even if your dataset is perfectly modelled and your DAX is tuned to perfection, the caching that takes place within the Power BI engine (or Analysis Services, if you are using a Live connection) will almost always mean that the first time you run a query it will be slower than subsequent executions.
With Analysis Services Live connections you can clear the cache, and get worst-case cold-cache execution timings by running and XMLA ClearCache command from SQL Server Management Studio or by using DAX Studio’s Clear Cache button. You should also ensure that you are running your tests on an instance of Analysis Services that no-one else is connected to or doing anything on: this ensures that other users’ queries do not populate the cache, or use resources that affect the performance of your queries.
Things aren’t so simple with Power BI Import mode datasets – I’m pretty sure you can use XMLA to clear the cache of a Premium capacity via XMLA endpoints although I’ve not tried it, but even then it’s not likely you can ensure that you’re the only user on that capacity. With Shared capacity there’s no way to clear the cache. As a result you should instead accept that you can only test warm cache performance in the browser by loading your report once before you test performance. In any case, I think the kind of DAX-focussed testing that requires a cold cache can be done more easily in Power BI Desktop.
If you’re using a DirectQuery dataset then you also need to worry about caching, security and other users running queries in the underlying data source. Again, as far as possible you should try to ensure that no-one else is running queries or doing anything else in the data source that could have an effect on the performance of your queries.
Simulating activity from other users
With DirectQuery connections, Live connections and Import mode, it might be that other users’ activity or processing/refreshing of cubes/models/datasets is causing the performance problem. I still think it’s important to do any testing when there are no other real users around – if you want to simulate other users running queries while you do your testing there are tools that can do that, including the Premium capacity load testing tool.
Do what your users do
Sometimes performance problems only occur in very specific circumstances, for example when a specific item has been selected in a filter or more than one item is selected, so it’s really important to create tests that reflect real-world patterns of interaction. Don’t forget that users might also be subject to row-level security that you as a developer are not affected by.
As you will have realised by now, creating realistic, repeatable tests for Power BI reports in the browser is not an easy task because there are so many different things to take into account. I’m not going to pretend this is an exhaustive list (I haven’t mentioned factors such as using a dataset that has the same data volumes as your production dataset, because that’s not specific to testing reports in the browser) but if you think I have forgotten something please leave a comment below – I would be interested to hear your thoughts.
Click here for part 2 in this series, where I show how to measure report render time in the browser. Click here for part 3, where I show a worked example of how to use the tools and techniques described in part 2.
Recently I was involved in troubleshooting a mysterious Analysis Services Multidimensional performance problem for a customer: the team worked out that certain queries run by certain users were extremely slow, and that these users were members of roles where dimension security was applied, but the amount of slowdown – queries going through the role were taking over 10 minutes compared to a few seconds when run as an administrator – was unlike anything I had seen before. It turned out that the cause was having the Enable Visual Totals box checked on every attribute on the dimension where security was applied, not just the attributes whose members were secured.
I can’t reproduce the problem with the Adventure Works cube but I can use it to illustrate the problem. Let’s say you have a role that applies dimension security on the Country attribute of the Customer dimension:
Normally, in this scenario, you would only check the Enable Visual Totals box for the Country attribute:
When running a query with this role applied, in the Query Subcube Verbose event in Profiler you will see a slice is put on the Country attribute:
However, if the Enable Visual Totals box is checked for every attribute on the dimension then a slice is put on every attribute that has its hierarchy enabled:
The more of these slices there are the slower everything gets inside Analysis Services: slower scans, slower cache registry lookups and so on. In the case of the cube I was looking at the combination of all of these slices, extremely complex MDX calculations and unprocessed indexes led to the massive performance problem. Obviously if you have to use Enable Visual Totals on your role then you have to use it, and it’s extremely unlikely you will encounter this problem, but it’s good to know about it just in case.
Back in 2018 I wrote a blog post about how to troubleshoot web service refresh issues using a third party tool called Fiddler. It’s still relevant but Fiddler is a complex tool and installing it and giving it the right permissions to see everything it needs to see isn’t easy. Luckily, the new Power Query Diagnostics feature in Power BI (released in October 2019) means you don’t need to use Fiddler for this kind of work any more. Full details on how to use this feature are given here, but I thought it would be useful to focus on one particular use case for it.
Let’s take one of the Power Query queries from my previous post as an example:
Source = OData.Feed("https://api.parliament.uk/odata"),
#"Filtered Rows" =
each ([GroupName] = "Cabinet Office")
[Note that this is a public web service which requires no authentication, so you will be able to run this query yourself]
This query connects to the UK Houses of Parliament OData API and returns the table of government organisations filtered down to the row where the GroupName column equals the text “Cabinet Office”. What requests are made to the web service when this query runs?
The first thing to point out is that different things will happen depending on whether you refresh the preview window in the Power Query Editor or refresh the dataset inside the main Power BI window. If you’re doing performance tuning you should ignore what happens when you refresh the preview in the Power Query Editor and always refresh from the main Power BI window.
The second is that, when you refresh a dataset (or a table within a dataset) from the main Power BI window, you might find that more work is being done than you expect as a result of “background refresh” taking place. This is something I blogged about here and you should always turn it off before doing any performance tests, because it’s not something that will happen in the service and it can also slow things down – potentially a lot if there are lots of queries and steps. You can turn it off from the Options dialog in the Data Load tab by deselecting “Allow data preview to download in the background”.
The third is that the Query Diagnostics feature I’m going to use is, at the time of writing, in Preview, so you may need to enable it in the Preview features pane of the Options dialog:
With that done, here are the steps to find out what requests are made to the web service when the above query is run.
First, let the query load data into the dataset as usual without doing anything else. You’ll see the table that the query has loaded data into in the Fields pane of the main Power BI window:
Next, reopen the Power Query Editor window, go to the Tools tab and click on the Start Diagnostics button.
Do not do anything else in the Power Query Editor window, instead minimise (do not close) it and then go back to the main Power BI window. There, right-click on the table that holds the output of the query in the Fields pane, right-click and select “Refresh data”:
Once the refresh has taken place, go back to the Power Query Editor window and click the Stop Diagnostics button:
When you have done that, you’ll see two new queries created in a group called Diagnostics:
These queries contain the diagnostics information for the refresh you’ve just done. The query with “Detailed” in its name has all the detailed information; the other query contains summarised data.
Looking at the “Detailed” query, each row represents a single operation that takes place within the Power Query engine when the refresh took place and while I don’t know what all this means (I’m working on it!), if you go to the Data Source Query column you’ll see a list of all the HTTP requests made to the web service:
Clicking on an individual cell in this column allows you to see the full text of the request and response made in the preview pane underneath the results pane:
While Fiddler may still be necessary for some more advanced scenarios, this is a much easier way of troubleshooting web service data sources in the Power Query Editor, for example when you need to check the exact request made to see if query folding is taking place on an OData data source.
Among all the exciting announcements made at Ignite last week, one you may have missed (even in all the Excel-related announcements here) was that dynamic arrays have finally reached GA. Ever since they were announced I’ve been interested in how they can be used with Excel cube functions, which allow you to get data from Analysis Services, Power BI and the Excel Data Model/Power Pivot into cells on the worksheet, and I’ve blogged about this once already. Even though right now the CubeValue function can’t be used with dynamic arrays – alas – there are still some interesting ways dynamic arrays and cube functions can be used together and in this post I’ll show you another one I’ve found.
Say you have the following set of tables in the Excel Data Model:
And the following measure that counts the number of customers who bought something:
Let’s also say you have on a worksheet the following table of data showing the number of distinct customers broken down by product, created using the CubeMember and CubeValue functions:
Here are the formulas for this table:
So far, all very straightforward. Now let’s say you want to use a dynamic array to filter the rows in this table so you only see the products that have more than a certain number of distinct customers, and you want to see a grand total for this filtered list of customers, like so:
Here are the formulas to achieve this:
To explain what’s going on here:
Cell C2 contains the text “ThisWorkbookDataModel”, the name of the connection to the Excel Data Model. Doing this makes the other Excel formulas here much easier to read!
Cell F2 – which I’ve turned into a named range called FilterThreshold – contains the number of customers which a product must exceed to be shown.
Cell E5 contains the dynamic array Filter function that filters the rows shown in the first two screenshots above and only returns the products that exceed the threshold. The formula is:
Cell E6 does something similar to E5 but returns the number of distinct customers for each product. If all I wanted was the products and the number of distinct customers I wouldn’t have needed two separate formulas, I could have used one, but I need to add the grand total too…
Cell E4 contains the formula:
CUBESET($C$2, E5#, “Total”)
This creates a set from the spill reference E5#, which contains the cells containing CubeMember formulas returned by the Filter function. For some reason CubeSet returns an error if you try to use a dynamic array formula in its second parameter but using a spill reference works ok; this is why I had to use separate formulas in E5 and E6.
Finally, cell F4 contains the formula that returns the total distinct count for all products returned in the filtered list using CubeValue and the set created in cell E4:
CUBEVALUE($C$2, “[Measures].[Distinct Customers]”, E4)
Since this is a distinct count the only way to get the correct value here is using CubeSet, CubeValue and the DAX measure – there would be no way to calculate the correct value using Excel formulas. The same could be said of almost any other DAX measure that wasn’t a simple aggregation.
You can download the sample workbook for this post here.