Measuring Power Query CPU Usage During Power BI Dataset Refresh

Some time ago I wrote a post about how optimising for CPU Time is almost as important as optimising for Duration in Power BI, especially if you’re working with Power BI Premium Gen2. This is fairly straightforward if you’re optimising DAX queries or Analysis Services engine-related refresh activity. But what about Power Query-related activity? You may have a small dataset, but if you’re doing a lot of complex transformations in Power Query they could end up using a lot of CPU, even after the CPU smoothing that Premium Gen2 applies to background activity. How can you measure how expensive your Power Query queries are in terms of CPU? In this post I’ll show you how.

Let’s consider two Power Query queries that return a similar result, each connected to a different table in the same Power BI dataset. The first query returns a table with one column and one row, where the only value is a random number returned by the Number.Random M function:

#table(type table [A=number],{{Number.Random()}})

The second query also returns a table with a single value in it:

let
  InitialList = {1 .. 1000000},
  RandomNumbers = List.Transform(
    InitialList,
    each Number.Random()
  ),
  FindMin = List.Min(RandomNumbers),
  Output = #table(
    type table [A = number],
    {{FindMin}}
  )
in
  Output

This second query, however, generates one million random numbers, finds the minimum and returns a table containing that value – which of course is a lot slower and more expensive in terms of CPU.

If you run a SQL Server Profiler trace connected to Power BI Desktop and refresh each of the two tables in the dataset separately, the Command End event for the refresh will tell you the duration of the refresh and the amount of CPU Time used by the Analysis Services engine for it (there will be several Command End events visible in Profiler, but only one with any significant activity, so it’s easy to spot the right one). In Desktop, however, the CPU Time for the Command End event does not include any CPU used by the Power Query engine. Here’s what the Command End event for the first Power Query query above looks like in Desktop:
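By the way, to connect Profiler to Power BI Desktop you need the port number of the local Analysis Services instance that Desktop starts for the open pbix file. Tools like DAX Studio will show you this port once they’re connected, or you can read it from the msmdsrv.port.txt file in the current workspace folder. Here’s a minimal M sketch of the latter approach; the folder path below is only an assumption (the workspace folder name is generated each time Desktop opens a file, and the location may be different for Microsoft Store installs), and the encoding of the file can vary:

let
  // Root folder where Power BI Desktop keeps its local Analysis Services workspaces.
  // This path is an assumption: replace the user name, and note that the location
  // may differ for Microsoft Store installs of Desktop
  WorkspacesFolder = "C:\Users\<your user name>\AppData\Local\Microsoft\Power BI Desktop\AnalysisServicesWorkspaces",
  // Find the msmdsrv.port.txt file(s) written by the currently open Desktop session(s)
  AllFiles = Folder.Files(WorkspacesFolder),
  PortFiles = Table.SelectRows(AllFiles, each [Name] = "msmdsrv.port.txt"),
  // The file contains just the port number; it is usually UTF-16 encoded, so try a
  // different encoding if the result looks garbled
  AddPort = Table.AddColumn(
    PortFiles,
    "Port",
    each Text.Trim(Text.FromBinary([Content], TextEncoding.Unicode))
  ),
  Output = Table.SelectColumns(AddPort, {"Folder Path", "Port"})
in
  Output

You can then connect Profiler using Analysis Services as the server type and localhost:<port> as the server name.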

As you would expect, the values in both the Duration and CPU Time columns are low. Here’s what the Command End event looks like for the second query above:

This time the refresh is much slower (the Duration value is much larger than before) but the CPU Time value is still low, because the Analysis Services engine is still only receiving a table with a single value in it. All the time taken by the refresh is spent in the Power Query engine.

If you publish a dataset containing these queries to a Premium workspace in the Power BI Service, connect Profiler to the XMLA Endpoint for the workspace (the endpoint address takes the form powerbi://api.powerbi.com/v1.0/myorg/YourWorkspaceName), and then refresh the two tables again, you won’t notice much difference for the first, fast query:

[Note that in this screenshot I’ve chosen a Command End event comparable to the one I used in Desktop, although for some reason it doesn’t show the duration. The overall refresh duration, which includes some extra work to do a backup, is around 2 seconds.]

However, for the second, slower query you can see that the CPU Time for the Command End event is much higher. This is because in the Power BI Service the event’s CPU Time includes all the Power Query-related activity as well as all Analysis Services engine activity:

This is a simple example where there is very little work being done in the Analysis Services engine, which means that pretty much all the CPU Time can be attributed to the Power Query engine. In the real world, when you’re working with large amounts of data, it will be harder to understand how much work is being done in the Analysis Services engine and how much in the Power Query engine. This is where Power BI Desktop comes in, I think. In Desktop you know you are only seeing the CPU used by the Analysis Services engine, so if there is a big difference in the ratio of CPU Time to Duration for your refresh in Power BI Desktop compared to the Power BI Service, it’s highly likely that the difference is due to Power Query engine activity, and that’s where you should concentrate your optimisation efforts.
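To make that comparison concrete, here’s a trivial M sketch with made-up numbers: note down the Duration and CPU Time values from the Command End event for the same refresh in Desktop and in the Service, then compare the two CPU Time to Duration ratios.

let
  // Made-up example values, in milliseconds, read from the Command End events
  DesktopDuration = 120000,
  DesktopCPUTime = 15000,
  ServiceDuration = 125000,
  ServiceCPUTime = 160000,
  // In Desktop the CPU Time only covers the Analysis Services engine; in the Service
  // it also covers the Power Query engine, so a much higher ratio in the Service
  // suggests the Power Query engine is where the CPU is being used
  DesktopRatio = DesktopCPUTime / DesktopDuration,
  ServiceRatio = ServiceCPUTime / ServiceDuration,
  Output = #table(
    type table [Environment = text, CPUTimeToDurationRatio = number],
    {{"Desktop", DesktopRatio}, {"Service", ServiceRatio}}
  )
in
  Output

In this made-up example the ratio is around 0.13 in Desktop but around 1.3 in the Service, which would suggest most of the CPU is being used by the Power Query engine.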

Of course the next question is: how can you optimise Power Query queries so they use less CPU? I don’t know, I haven’t done it yet – but when I have something useful to share I’ll blog about it…
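Purely as an illustration of the kind of rewrite you could experiment with (I haven’t tested whether it actually uses less CPU), the second query above could generate its one million random numbers with a single call to the List.Random function instead of calling Number.Random once per item via List.Transform:

let
  // List.Random returns a list of one million random numbers in a single call,
  // rather than invoking Number.Random once per item
  RandomNumbers = List.Random(1000000),
  FindMin = List.Min(RandomNumbers),
  Output = #table(
    type table [A = number],
    {{FindMin}}
  )
in
  Output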

17 thoughts on “Measuring Power Query CPU Usage During Power BI Dataset Refresh”

  1. Agree, good read. I’ve mostly been doing trial-and-error improvements to my M. I’ve been thinking about adding metadata with datetime stamps to each row processed so I can see when rows (steps/functions) start taking longer, and doing min/max to get total run times. I do 90% of my work in Dataflows, so anything on M performance tweaking is interesting!

  2. Awesome post, thanks! Also, does using an on-premises data gateway affect these results at all? And would we be able to tell which usage counts against our Premium Gen2 capacity versus what is executed on the gateway?

    Thanks,

    Alex

  3. Thanks for blogging about CPU usage in the service.

    CPUs are a limited resource in our PBI Premium Gen2 capacity. In the case of our P1 license, we only have 4 front-end and 4 back-end cores.

    We are frequently spiking our CPU capacity at times we least expect. Our front-end cores, for example, are frequently the source of temporary spikes to 100%. It is confusing and difficult to troubleshoot in real-time. Worse yet, it is hard to conceptualize how much of a CPU workload this represents, since we know almost nothing about the nature of those gen2 cores.

    Is there any way to determine the nature of the cores used in our P1 capacity? Are they using some old 2.x GHz processors or modern 4.x GHz ones?

    In the case of various other types of Azure compute (IaaS), we can easily discover what we are paying for. Eg:
    https://docs.microsoft.com/en-us/azure/virtual-machines/sizes-general

    It says something about Power BI customers that they don’t seem to want to know anything about the CPU hardware they are being billed for. IMHO, the CPU characteristics of our P1 capacity should be displayed prominently (e.g. in the capacity metrics app).

    1. No, there isn’t a way to determine the nature of the cores. At the end of the day you should think of CPU time as a more abstract concept, describing how much work a query or a refresh does relative to the overall amount of CPU your Gen2 capacity allows you to use.

      1. I don’t see how this makes any sense. Any discussion about CPU should start with this information. I agree that everyone has an abstract concept of what a CPU is. However, if they aren’t going to tell us what type of cores they are, why do they even bother to tell us we have four of them? 😉

        Based on how quickly the CPU spikes to 100% for a few concurrently rendering reports, I have the sense the cores are comparable to ~2.6 GHz Xeon processors from ten years ago.

        Again, this ambiguity would not be well accepted in other parts of Azure… And I think the Microsoft Power BI platform benefits when customers don’t really understand what they are paying for. I suspect a lot of customers are willing to knee-jerk into a decision to turn on “autoscale” when they receive their first capacity alert from the service.

      2. We say you have 4 cores for a P1 so you can compare it with 8 cores in a P2, 16 in a P3 and so on. It’s all relative.

        It’s also worth remembering that Premium Gen2 (unlike Gen1) is nothing like a virtual machine. We aren’t really giving you a box with a set number of cores; we are giving you the ability to use a set amount of CPU while the datasets and other workloads are spread across an infrastructure we manage.

      3. I understand that it isn’t a VM.

        But the term “CPU core” is one that people might reasonably believe they can relate to. (If it isn’t intended to be understood in the same sense as a VM, then you should use another term like PBI-processing-units-per-environment-per-hour – aka PBIPUPEPHR).

        Can you at least confirm if the relative performance of the hardware used for “backend cores” is the same as the “frontend cores”?

        In any case, your blog is helpful. Thanks a lot for working on these. This one gives us a rough approach for benchmarking CPU workloads within the service. I suppose if we wanted to, we could run the same workload locally on a 2.6 GHz CPU and see if the processing power of the service is in the same ballpark. Maybe one day someone will create a robust benchmarking tool for measuring PQ performance (and rendering performance). Since both those functions are possible in Desktop, it shouldn’t be too much trouble to pin down the approximate/analogous specifications of the Gen2 “cores”.

      4. Frontend cores are a completely separate part of the infrastructure and not something you have any visibility of. I agree that using the term “core” is causing some confusion in this conversation, but in general I find it’s easier to explain to people than, say, “QPU” is with AAS.

      5. Hi Chris, I can see why the frontend cores aren’t relevant to your blog, but they certainly are a prominent/visible component of the service.

        Almost all of our active/ongoing capacity alerts from PBI are related to front-end cores. The alerts are based on usage patterns of the reports themselves (rather than scheduled dataset operations). For example, if a flock of kiosks are all rebooted at the same time each night, and they all render a CPU-intensive report in Power BI at the same time, that will cause a big spike in frontend CPU.

        … This spike can be monitored to a degree in the “capacity metrics” app, where we see telemetry for both front-end and back-end cores. Fixing the problem is more of an art than a science. And unfortunately none of this will ever be scientific if a “P1 core” doesn’t even have a proper definition.

      6. No, you’re wrong – none of your alerts are related to front-end cores. The only usage that counts towards the limits on Premium gen2 is usage on back-end cores, and when your reports run it’s the DAX queries that are causing the spike on the back-end cores. The Premium Capacity Metrics app only shows activity on back-end cores.

      7. I appear to be wrong about the CPU that is reported in the capacity metrics app. I’ve had many conversations about this topic with “unified” Microsoft support engineers, and with consulting partners as well (Bluegranite). The conversations center around the “overloading” and “CPU” charts. Those charts make a distinction between “Background %” and “Interactive %”. These were always explained as the back-end and front-end cores, respectively.

        However, the docs (“Using the Premium Gen2 metrics app”) state things differently. It is as you said:

        “Interactive operations cover a wide range of resources triggered by Power BI users. These operations are associated with interactive page loads and are handled by *backend* cores.”

        Admittedly I always found the capacity app to be confusing and a bit unintuitive. But based on the fact that the Premium licensing refers to both front-end and back-end cores, it is reasonable for your customers to try and find some representation of the *front-end* core usage in that dashboard. It isn’t surprising that so many of us believe the “interactive” CPU is a reference to the front-end cores. Maybe there should be a separate tab in the capacity app that is titled “front-end cores” and simply has a message: “no data is provided for monitoring front-end cores; all of the interactive CPU alerts are related to back-end cores”.

        Thanks again for clearing this up. The “CPU” usage in the PBI service is pretty opaque and can be confusing. One day I think we will need a full implementation of Task Manager that gives us real-time CPU and memory. 😉
