In the second post in this series I discussed a KQL query that can be used to analyse Power BI refresh throughput at the partition level. However, if you remember back to the first post in this series, it’s actually possible to get much more detailed information on throughput by looking at the ProgressReportCurrent event, which fires once for every 10000 rows read during partition refresh.
Here’s yet another mammoth KQL query that you can use to analyse the ProgressReportCurrent event data:
PowerBIDatasetsWorkspace | where TimeGenerated > ago(1d) | where OperationName == "ProgressReportCurrent" or OperationName=="ProgressReportBegin" or OperationName == "ProgressReportEnd" | where XmlaObjectPath!="" | where OperationDetailName=="ReadData" | project OperationName, XmlaObjectPath, EventText, TimeGenerated, ProgressCounter, OperationDetailName, XmlaRequestId, ObjectPathRequestId = strcat(XmlaObjectPath ,XmlaRequestId) | partition by ObjectPathRequestId ( sort by TimeGenerated asc | scan declare( PrevEventTimeGenerated:datetime, TimeSincePrevEvent:timespan=time(0s), TimeSinceRefreshBegin:timespan=time(0s), NumberOfRowsRead:int=0) with ( step RefreshStarts: OperationName=="ProgressReportBegin"; step RefreshInProgress: OperationName == "ProgressReportCurrent" => PrevEventTimeGenerated = iff(ProgressCounter==10000, RefreshStarts.TimeGenerated, RefreshInProgress.TimeGenerated), TimeSincePrevEvent = iff(ProgressCounter==10000, TimeGenerated - RefreshStarts.TimeGenerated, TimeGenerated-RefreshInProgress.TimeGenerated), TimeSinceRefreshBegin = TimeGenerated - RefreshStarts.TimeGenerated, NumberOfRowsRead = iff(ProgressCounter==10000, 10000, ProgressCounter-RefreshInProgress.ProgressCounter); step RefreshEnds: OperationName == "ProgressReportEnd" => PrevEventTimeGenerated = RefreshInProgress.TimeGenerated, TimeSincePrevEvent = TimeGenerated - RefreshInProgress.TimeGenerated, TimeSinceRefreshBegin = TimeGenerated - RefreshStarts.TimeGenerated, NumberOfRowsRead = ProgressCounter - RefreshInProgress.ProgressCounter; ) | where OperationName == "ProgressReportCurrent" or OperationName == "ProgressReportEnd" ) | project XmlaObjectPath, Table = split(XmlaObjectPath,".", 2), Partition = split(XmlaObjectPath,".", 3), OperationName, StartTime = format_datetime(PrevEventTimeGenerated,'yyyy-MM-dd HH:mm:ss.fff' ), EndTime = format_datetime(TimeGenerated,'yyyy-MM-dd HH:mm:ss.fff' ), ProgressCounter, SecondsSincePreviousEvent = TimeSincePrevEvent/1s, SecondsSinceRefreshBegin=TimeSinceRefreshBegin/1s,NumberOfRowsRead, NumberOfRowsReadPerSecond=NumberOfRowsRead/(TimeSincePrevEvent/1s), XmlaRequestId
Here’s a brief explanation of what it does:
- It filters the Log Analytics data down to get events from the last day and just the ProgressReportCurrent events, as well as the ProgressReportBegin/End events which are fired before and after the ProgressReportCurrent events.
- It then splits the data into groups of rows (‘partitions’ in KQL, but of course not the partitions that are being refreshed) by a combination of XmlaRequestId (ie the refresh operation) and XmlaObjectPath (ie the partition that is being refreshed)
- For each group of rows it will then:
- Find the ProgressReportBegin event and from this get the time when data started to be read from the source
- Get all subsequent ProgressReportCurrent events and calculate the amount of time elapsed since the previous event (which might be the ProgressReportBegin event or a previous ProgressReportCurrent event) and the number of rows read
- When the ProgressReportEnd event is encountered, calculate the amount of time elapsed since the previous ProgressReportCurrent event and the number of rows (which will be less than 10000) read since then
- Filter out the ProgressReportBegin events because we don’t need them any more
- Finally, add columns that splits out the table name and partition name and calculates the number of rows read per second for each row by dividing the number of rows read for each event by the amount of time elapsed since the previous event
What can this query tell us about throughput?
First of all, something interesting but not necessarily useful. At least for the data source I’m using for my tests, when I plot a column chart with the number of rows read on the x axis and the amount of time elapsed since the last event on the y axis (ie the amount of time it takes to read 10000 rows for all but the last column) then I noticed that every 200000 rows something happens to slow down the read:
I have no idea what this is, whether it’s a quirk of this particular source or connector, but it’s a great example of the kind of patterns that become obvious when you visualise data rather than look at a table of numbers.
Plotting time on the x axis of a line chart and the cumulative total of rows read on the y axis gives you something more useful. Here’s the chart for one of the refreshes mentioned in my last post where four partitions of the same table are refreshed in parallel:
In this case throughput is fine up until the end of the refresh at which point something happens to the February, March and April partitions but not the January partition to slow them down for about 30 seconds, after which throughput goes back to what it was before. Here’s the same chart zoomed in a bit:
Here’s the same problem shown in the first graph above, where the number of rows read is on the x axis, showing how for example with the April partition there’s a sudden spike where it takes 14 seconds to read 10000 rows rather than around 0.3 seconds:
What is this, and why isn’t the January partition affected? Maybe it was a network issue or caused by something happening in the source database? Looking at another refresh that also refreshes the same four partitions in parallel, it doesn’t seem like the same thing happens – although if you look closely at the middle of the refresh there might be a less pronounced flattening off:
Again, the point of all this is not the mysterious blips I’ve found in my data but the fact that if you take the same query and look at your refreshes, you may find something different, something more significant and something you can explain and do something about.