I’ll leave it to other people to get worked up about things like Power BI Premium gen2 – what really gets me excited is a new Profiler event type. And indeed there is a new one to play with – Job Graph – that you can use to visualise Azure Analysis Services processing jobs (it’s not enabled yet for the Analysis Services engine in Power BI).
More details about it, and how it can be used, are in the samples here:
The data returned by the Job Graph event isn’t intelligible if you look at the text it returns in Profiler. However if you save a .trc file with Job Graph event data to XML you can use the Python scripts in the GitHub repo to generate DGML diagrams that can be viewed in Visual Studio, plus Gantt charts embedded in HTML. Of course to do this you’ll need to have Python installed; you’ll also need to have Visual Studio and its DGML editor installed (see here for details).
Here’s the DGML diagram for the refresh of a very simple AAS database with just one table in it, called People:
The eight nodes in a semi-circle around the top of the diagram represent the eight jobs that process the attribute hierarchies in the table. These eight jobs must be completed before the single partition in the table is processed (this is the node in the centre of the diagram) and that partition must be processed before the table itself is processed (this is the node at the bottom of the diagram).
Since this is the annotated output you can hover over a node in the diagram and see more details about it, including its duration and how long it was blocked for:
Notice how three nodes have dark backgrounds? These nodes represent the critical path. From the sample documentation:
When the job graph executes, there is always a job that finishes last before the engine can commit the change. This job that finishes last is the “critical dependency” for the commit; the entire commit needs to wait for this one job to finish before executing. This last job depends on other jobs, one of which finished after all the others. This is the next critical dependency. Tracing this path of critical dependencies forms the critical path, which helps engineers and customers identify why processing takes so long.
To diagnose slow refresh times, look for the critical path and start at the top. Look at “Blocked duration”, “Waiting duration”, and “Running duration”. If a job has a long blocked duration, it spent a long time waiting on other jobs. If a job has a long waiting duration, it was waiting for an available thread, so increasing maxParallelism could help. If a job has a long running duration, then the model might have to be changed to speed up the job.
Here’s another diagram, generated from a slightly more complex database with four tables, a calculated column and a calculated table:
There’s also a script that takes this data and generates a Gantt chart from it, but at the time of writing it seems to be a bit buggy; I can make it work if I comment out some code but I suspect that means the output isn’t accurate. Hopefully it will get fixed soon, but in the meantime here’s an example of what it produces:
I think this is going to be really useful when I’m troubleshooting processing performance problems. It would also be really cool to have this in DAX Studio (hint, hint)…