Analysing Dataset Refresh In Power BI Premium Using SQL Server Profiler

A few weeks ago I showed how XMLA Endpoints allow you to connect SQL Server Profiler to Power BI Premium. As well as looking at query execution times this also means you can see in more detail what happens when a dataset is refreshed, for example so you can find out exactly how long a refresh took, understand which tables inside the dataset contribute most to refresh times or which calculated columns or calculated tables take the longest to create. Remember that refresh performance in Power BI Desktop may be different to refresh performance in the Power BI Service where you’re running on different hardware, may be going through an on-premises gateway to reach your data sources and where other operations may also be consuming resources while refresh is taking place.

To do this you need to create a trace in the way I describe in that previous post and include (at least) the following two pairs of trace events:

  • Command Begin/End
  • Progress Report Begin/End

Once the trace is running and you start a refresh, you’ll see a lot of events generated:

Trace

What does this all mean though? Generally speaking any books or blog posts that talk about processing in Analysis Services Tabular will also apply to refresh in Power BI since the two engines are very similar, although there are many options available in Analysis Services that are not (yet) available for Power BI. For example, my colleage Richard Tkachuk has just written a blog post on Analysis Services processing here that is a good introduction to the subject; chapters 11, 12 and 15 of Marco and Alberto’s book “Tabular modelling in Microsoft SQL Server Analysis Services” provide a lot more detail. If you’re interested in the performance of your Power Query/M queries this post of mine from last year is worth reading too.

It looks a lot more complex than it actually is though. The columns in the trace you should focus on are the ones shown in the screenshot above: EventClass, EventSubclass, TextData and Duration. In EventClass you’ll see pairs of Command Begin/End and Progress Report Begin/End events; EventSubclass and TextData give you more information about each event; and Duration will give you the amount of time taken for each event (it only appears for the “end” events in the pairs) in milliseconds. A refresh is represented by a single Command, so when you’re looking for the events in your trace that relate to a refresh you’ll need to look for a pair of Command Begin/End events with the word “Refresh” visible in amongst a lot of XML in the TextData column and with a lot of Progress Report Begin/End events in between:

Refresh

The Command End event of this pair will give you the time taken for the refresh in the Duration column:

CommandEnd

Finding the amount of time needed to refresh individual tables is not so straightforward because (as Richard Tkachuk describes in his blog post mentioned above) refreshing a table consists of a number of different jobs. The easiest thing to do is to look for a Progress Report End event with EventSubclass 59 where the TextData column starts with “Finished processing partition…” followed by the name of your table:

Table

The Duration column for this event will give you the amount of time taken for the majority of work needed to refresh a table.

There is, of course, a lot more that you can see in these traces and a lot more detail you can get by adding other events, but I’ll stop here for now. If you find anything interesting while you’re looking at traces of your own please let me know by leaving a comment!

6 responses

  1. Pingback: Use cases of the XMLA endpoint – justB smart

  2. Pingback: Power BI Insights: Data refresh; Parallel scripts; Moving columns; Power Query | MSDynamicsWorld.com

  3. Pingback: Power BI Insights: Data refresh; Parallel scripts; Moving columns; Power Query ERP for Hong Kong SME

  4. Hello Chris, This is much useful. i have used the Query end in trace as instructed and did lot of tuning thanks for that, now i am into making processing better and faster, and looking at this blog given me some information excluding actual query executing while to process read data from source database, can i have any other trace fields or other way to grab those queries instead? much helpful .

  5. I tried the trace on a power bi dataset in a premium workspace but this slowed down the dataset refresh from 3 minutes to 12 minutes. Now, even after stopping the refresh in the SQL profiler, subsequent refreshes are taking 12 minutes.

    Is the trace still running on the server? If so, how can we switch off the trace on the server hosting the premium workspace dataset.

    • Hmm, the trace should not still be running, but I’m pretty sure there’s a DMV query you can run to see all active traces so that will tell you. I’m surprised that running a trace had this kind of effect – are you sure nothing else is happening on the capacity? Have you tried restarting the capacity?

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: