In the first part of this series I showed how to create a Power BI DirectQuery dataset connected to Log Analytics to do near real-time monitoring of Analysis Services activity; in the second part I showed how to use this technique to monitor dataset refreshes in a workspace. In this blog post I’ll discuss how you can use this technique to monitor queries on your datasets.
As with the post on refreshes, the first and most important question to ask is: why do you need a DirectQuery dataset to monitor query activity? In all Power BI projects Import mode should be used where possible because it gives you better performance and more functionality. If you want to do long-term analysis of query activity then you’re better off importing the data from Log Analytics rather than using DirectQuery. So when and why would you use DirectQuery to monitor query activity in Power BI? The answer has to be when you need to see what queries have been run very recently, in the last hour or two: when someone calls up to complain that their queries are slow, you can see what queries they have just run and try to work out why they are slow.
It’s actually very easy to build a simple KQL query to look at query activity on your datasets: you just need to look at the QueryEnd event (or operation, as it’s called in Log Analytics), which is fired when a query finishes running. This event gives you all the information you need: the type of query (DAX or MDX), the duration, the CPU time, the query text and so on. The main challenge is that while you also get the IDs of the report and visual that generated the query, you don’t get the names of the report or visual. I wrote about how to get a list of visual and report IDs here and here, but how can you use that information?
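Before worrying about report and visual names, the basic query can be sketched as follows. This is a minimal sketch: the PowerBIDatasetsWorkspace table and these columns come from the Power BI/Log Analytics integration used throughout this series, but the exact set of columns you project is up to you.

```kql
// List all queries that finished in the last two hours
PowerBIDatasetsWorkspace
| where TimeGenerated > ago(2h)
| where OperationName == 'QueryEnd'
| project TimeGenerated, ArtifactName, ExecutingUser, OperationDetailName,
          DurationMs, CpuTimeMs, EventText
| order by TimeGenerated desc
```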
Here’s an example KQL query that does just that:
let Reports = datatable (ReportId:string, ReportName:string) [
    "Insert Report ID here", "My Test Report"
];
let Visuals = externaldata(VisualId: string, VisualTitle: string, VisualType: string) [
    "Insert Shared Access Signature To CSV File Here"
] with (format="csv", ignoreFirstRecord=true);
PowerBIDatasetsWorkspace
| where TimeGenerated > ago(2h)
| where OperationName == 'QueryEnd'
| extend a = todynamic(ApplicationContext)
| extend
    VisualId = tostring(a.Sources.VisualId),
    ReportId = tostring(a.Sources.ReportId),
    Operation = coalesce(tostring(a.Sources.Operation), "Query"),
    DatasetName = ArtifactName,
    Query = EventText
| lookup kind=leftouter Visuals on VisualId
| lookup kind=leftouter Reports on ReportId
| project ExecutingUser, TimeGenerated, WorkspaceName, DatasetName, Operation,
          OperationDetailName, VisualTitle, VisualType, Status, DurationMs, CpuTimeMs, Query
| order by TimeGenerated desc
To extract the report and visual IDs from the JSON in the ApplicationContext column I’ve used the todynamic() function and then split them out into separate columns. Then, in order to have tables to do the lookups against, I used two different techniques. For the reports I used the datatable operator to create a hard-coded table of report IDs and names; in this case I’ve only got one report, but even if you were doing this in the real world at the workspace level I think it’s feasible to maintain a hard-coded table of report IDs and names because there wouldn’t be that many of them. For the visuals I saved a table with the visual IDs, names and types to a CSV file in ADLSgen2 storage, then used the externaldata() operator (with a shared access signature, just to keep authentication simple) to read this data into a table used for the lookup.
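If you have more than one report in the workspace, the hard-coded lookup table simply grows by one row per report. A hypothetical sketch, where the IDs and report names are placeholders you would replace with your own:

```kql
// Hard-coded report ID -> report name lookup table (placeholder values)
let Reports = datatable (ReportId:string, ReportName:string) [
    "Insert first report ID here",  "Sales Report",
    "Insert second report ID here", "Finance Report"
];
```

The same pattern works for the visuals table too, but since a workspace can contain hundreds of visuals, keeping that list in a CSV file and reading it with externaldata() scales better than hard-coding it.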
Here’s the output:
It’s then easy to use this table in a Power BI DirectQuery dataset to build a report that shows you what queries have been run recently. This report looks a bit rubbish and I’m sure you can build something that looks a lot nicer: