In the first post in this series I showed how it was possible to create a DirectQuery dataset connected to Log Analytics so you could analyse Power BI query and refresh activity in near real-time. In this post I'll take a closer look at how you can use this data to monitor refreshes.
The focus of this series is using DirectQuery on Log Analytics to get up-to-date information (remember there's already an Import-mode report you can use for long-term analysis of this data), and this influences the design of the dataset and report. If you're an administrator, these are the refresh-related questions you will need answers to:
- Which datasets are refreshing right now?
- Which recent dataset refreshes succeeded and which ones failed?
- If a dataset refresh failed, why did it fail?
- How long did these refreshes take?
To answer these questions I built a simple dataset with two tables in it. First of all, I wrote a KQL query that answers these four questions and used it as the source for a DirectQuery table in my dataset. Here's the query:
```kusto
let
//get all refresh events
RefreshesInLast24Hours =
//this table holds all the Power BI log data for a workspace
PowerBIDatasetsWorkspace
//assume that no currently running refresh has run for more than 24 hours
| where TimeGenerated > ago(24h)
//only return the events fired when refreshes begin and end
| where OperationName in ('CommandBegin', 'CommandEnd')
    and EventText has_any ('<Refresh xmlns', '"refresh"');
//gets just the events fired when the refreshes begin
let CommandBegins =
RefreshesInLast24Hours
| where OperationName == 'CommandBegin';
//gets just the events fired when the refreshes end
let CommandEnds =
RefreshesInLast24Hours
| where OperationName == 'CommandEnd';
//do a full outer join between the previous two tables
CommandBegins
| join kind=fullouter CommandEnds on XmlaRequestId, EventText
//select just the columns relevant to refreshes
| project ArtifactName, XmlaRequestId,
    StartTime = TimeGenerated, EndTime = TimeGenerated1,
    DurationMs = DurationMs1, CpuTimeMs = CpuTimeMs1,
    Outcome = coalesce(Status1, 'In Progress'),
    ErrorCode = StatusCode1, WorkspaceId, WorkspaceName
| summarize StartTime = min(StartTime), EndTime = max(EndTime),
    DurationMs = max(DurationMs), CpuTimeMs = max(CpuTimeMs),
    ErrorCode = min(ErrorCode)
    by XmlaRequestId, ArtifactName, Outcome, WorkspaceId, WorkspaceName
```
Here’s an example of the output, a table with one row for each refresh in the last 24 hours:
Here’s how the query works:
- Gets all the CommandBegin and CommandEnd events – the events that are fired when a refresh begins and ends – that occurred in the last 24 hours and stores them in two tables.
- Only some of the CommandBegin and CommandEnd events are related to refreshes, so the query filters on the EventText column to find them.
- I chose to limit the data to the last 24 hours to keep the amount of data displayed to a minimum and because it’s unlikely that any refresh will take more than 24 hours. If I limited the data to the last hour, for example, then the query would not return the CommandBegin events or the CommandEnd events for any refreshes that started more than an hour ago and were still in progress, so these refreshes would not be visible in the report.
- Does a full outer join between the table with the CommandBegin events and the table with the CommandEnd events, so the query returns one row for every refresh that either started or ended in the last 24 hours.
- Limits the columns in the results to show:
- XmlaRequestId – the unique identifier that links all the events for a specific refresh operation
- ArtifactName – the dataset name
- WorkspaceId, WorkspaceName – the ID and name of the Power BI workspace where the dataset is stored
- StartTime and EndTime – the refresh start time and end time
- DurationMs – the duration of the refresh in milliseconds
- CpuTimeMs – the CPU time consumed by the refresh in milliseconds
- Outcome – whether the refresh succeeded, failed or is still in progress
- ErrorCode – the error code that tells you why the refresh failed
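The full outer join and the coalesce on Outcome are the key to seeing in-progress refreshes: a refresh that has begun but not yet ended has a CommandBegin event with no matching CommandEnd event. Here's a minimal Python sketch of that logic, using made-up request IDs and timestamps purely for illustration:

```python
# Hypothetical data: a refresh that has a CommandBegin event but no matching
# CommandEnd event should get an Outcome of 'In Progress'.
begins = {  # XmlaRequestId -> start time
    "req-1": "2022-01-10 09:00",
    "req-2": "2022-01-10 09:30",  # still running: no CommandEnd event yet
}
ends = {  # XmlaRequestId -> (end time, status)
    "req-1": ("2022-01-10 09:05", "Succeeded"),
}

# Full outer join on the request ID, like `join kind=fullouter`
rows = []
for request_id in begins.keys() | ends.keys():
    start = begins.get(request_id)
    end, status = ends.get(request_id, (None, None))
    # Like `coalesce(Status1, 'In Progress')`: fill in missing statuses
    outcome = status if status is not None else "In Progress"
    rows.append({"XmlaRequestId": request_id, "StartTime": start,
                 "EndTime": end, "Outcome": outcome})

for row in sorted(rows, key=lambda r: r["XmlaRequestId"]):
    print(row)
```

In this sketch req-1 comes out as Succeeded while req-2, which has no end event, comes out as In Progress.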
Next I created a second table to return more detailed information about a specific refresh. Here’s an example of the KQL:
```kusto
PowerBIDatasetsWorkspace
| where TimeGenerated > ago(24h)
| where OperationName in ('CommandBegin', 'CommandEnd', 'ProgressReportEnd', 'Error')
| where XmlaRequestId in ('5306b4ef-d74a-4962-9fc0-f7a299f16d61')
| project OperationName, OperationDetailName, TimeGenerated, EventText,
    XmlaRequestId, DurationMs, CpuTimeMs
```
This query returns the CommandBegin, CommandEnd, ProgressReportEnd and Error events for a specific refresh (so all these events have the same XmlaRequestId value).
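To see what you can do with these raw events once you have them, here's a hypothetical Python sketch that derives a refresh's duration from its begin and end events. The event names match the ones in the query above; the timestamps are invented for the example:

```python
# Given the raw events for a single XmlaRequestId, compute the duration
# between the CommandBegin and CommandEnd events in milliseconds.
# The timestamps here are hypothetical.
from datetime import datetime

events = [
    {"OperationName": "CommandBegin", "TimeGenerated": "2022-01-10T09:00:00"},
    {"OperationName": "ProgressReportEnd", "TimeGenerated": "2022-01-10T09:03:10"},
    {"OperationName": "CommandEnd", "TimeGenerated": "2022-01-10T09:05:00"},
]

def refresh_duration_ms(events):
    """Duration between the CommandBegin and CommandEnd events, in ms."""
    times = {e["OperationName"]: datetime.fromisoformat(e["TimeGenerated"])
             for e in events}
    delta = times["CommandEnd"] - times["CommandBegin"]
    return delta.total_seconds() * 1000

print(refresh_duration_ms(events))  # → 300000.0 (a five-minute refresh)
```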
One last complication was that, in my Power BI dataset, I was not able to create a relationship between these two tables (I got some kind of query folding error, although I don't know why), so in order to drill from the summary table to the detail table I had to use a dynamic M parameter for the XmlaRequestId used to filter the second table (see this post for more details on how to implement dynamic M parameters when working with KQL queries).
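Conceptually, what the dynamic M parameter does is substitute the XmlaRequestId selected in the report into the KQL text before the query is sent. This is not how Power BI implements it internally, but a hypothetical Python helper makes the substitution idea concrete:

```python
# Hypothetical illustration of parameterising the detail query: the selected
# XmlaRequestId is substituted into the KQL text before it runs.
DETAIL_QUERY_TEMPLATE = """PowerBIDatasetsWorkspace
| where TimeGenerated > ago(24h)
| where OperationName in ('CommandBegin', 'CommandEnd', 'ProgressReportEnd', 'Error')
| where XmlaRequestId in ('{request_id}')
| project OperationName, OperationDetailName, TimeGenerated, EventText,
    XmlaRequestId, DurationMs, CpuTimeMs"""

def detail_query_for(request_id: str) -> str:
    # Reject quote characters so a value can't break out of the KQL string literal
    if "'" in request_id:
        raise ValueError("invalid request id")
    return DETAIL_QUERY_TEMPLATE.format(request_id=request_id)

print(detail_query_for("5306b4ef-d74a-4962-9fc0-f7a299f16d61"))
```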
Here’s the report I built on this dataset showing how you can drill from the summary page to the details page for a particular refresh:
Not exactly pretty, but I'm not much of a data visualisation person and I think tables are fine for this type of data. You can download a sample .pbix file (I couldn't make a .pbit file work for some reason…) for this report here and prettify it yourself if you want. All you need to do is change the Cluster and Database parameters in the Power Query Editor to match the details of your Log Analytics workspace and database (see my last post for more details).
[Thanks to Chris Jones at Microsoft for his ideas and suggestions on this topic]
[Update 10th January 2022 – I’ve modified the summary KQL query based on feedback from another Microsoft colleague, Xiaodong Zhang, who pointed out that it didn’t work for TMSL scripts or datasets that use incremental refresh]