An M Function To Help You Explore Power Query Diagnostics Data

Last week’s post showed an M function that took Power Query diagnostics data and formatted in a way that made it suitable for visualisation in a Power BI Decomposition Tree visual. This is great for understanding what’s going on at a high level, but by doing this you also lose a lot of detailed information from the diagnostics logs that could be useful for performance tuning. This week I have an M function for you that takes a different approach to solving the same problem that might be more appealing for the hardcore Power Query fans out there.

Here’s the code for the function:

(Input as table) as table =>
Source = Input,
#"Expanded Additional Info" = Table.ExpandRecordColumn(Source, "Additional Info", {"Message"}, {"Message"}),
#"Calculated Total Seconds" = Table.TransformColumns(#"Expanded Additional Info",{{"Exclusive Duration", Duration.TotalSeconds, type number}}),
#"Sorted Rows" = Table.Sort(#"Calculated Total Seconds",{{"Id", Order.Ascending},{"Start Time", Order.Ascending}}),
#"Removed Other Columns" = Table.SelectColumns(#"Sorted Rows",{"Id", "Query", "Category", "Operation", "Start Time", "End Time", "Exclusive Duration (%)", "Exclusive Duration", "Data Source Query", "Message", "Row Count", "Content Length", "Path", "Group Id"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Message", type text}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type",null,"Missing",Replacer.ReplaceValue,{"Path"}),
BufferedTable = Table.Buffer(#"Replaced Value"),
GetAllChildRows = (CurrentId, CurrentPath) =>
Table.SelectRows(BufferedTable, each [Path]<>"Missing" and [Id]=CurrentId and Text.StartsWith([Path], CurrentPath)),
AddTotalED = Table.AddColumn(#"Replaced Value", "Exclusive Duration (Including Child Operations)", each List.Sum(GetAllChildRows([Id],[Path])[Exclusive Duration]), type number),
AddTotalEDPct = Table.AddColumn(AddTotalED, "Exclusive Duration (%) (Including Child Operations)", each List.Sum(GetAllChildRows([Id],[Path])[#"Exclusive Duration (%)"]), Percentage.Type),
#"Inserted Text Before Delimiter" = Table.AddColumn(AddTotalEDPct, "Parent Path", each Text.BeforeDelimiter([Path], "/", {0, RelativePosition.FromEnd}), type text),
#"Added Custom" = Table.AddColumn(#"Inserted Text Before Delimiter", "Child Operations", each
CurrentPath = [Path],
CurrentId = [Id],
ChildRows =
@#"Added Custom",
[Path]<>"Missing" and [Parent Path]=CurrentPath and [Id] = CurrentId
Output = if Table.RowCount(ChildRows)=0 then null else ChildRows
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Path] = "0" or [Path]="Missing")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Parent Path"})
#"Removed Columns"

As before you need to pass a query created by the Power Query diagnostics functionality to the function:


[Note: I recommend calling the function on the “detailed” diagnostics query and not the summarised view, because I’ve found what might be a bug with the summarised view]

The function will then return a table with one row for each query that is evaluated while the diagnostics were running:


[Note: You may find that some query evaluations have nulls instead of the name of the query; that’s an issue with the diagnostics data that will be addressed soon]

You could get the same thing simply by filtering the raw diagnostics data down to the rows where the Path column contains 0, but I’ve added two extra things that make the data much easier to understand.

First of all, the original data contains columns called “Exclusive Duration” and “Exclusive Duration (%)”, which tell you the amount of time each operation took in seconds and what percentage this was of the entire query evaluation. However, the data itself is in a parent/child structure so while a parent operation may have a very small duration it may have many slow descendant operations. Therefore my function adds two new columns called “Exclusive Duration (Including Child Operations)” and “Exclusive Duration (%)(Including Child Operations)”. The first of these new columns returns the total duration of each operation and all its child operations, while the second returns the percentage that the duration of each operation and all its child operations make up of the whole query duration. At the root level, this means the first of these new columns shows the total duration of each query in seconds and the second of these new columns should always show 100%.


Secondly, the function adds a new column called Child Rows that contains a nested table containing the child operations associated with each operation:


You can either click on the Table link in this column to navigate to a new table containing all the child operations, or use the Expand/Aggregate button in the top-right hand corner to keep the current operations and join the nested rows onto them (I strongly recommend you do the former). Experienced Power Query developers will be familiar with the concept of nested tables and I think this provides an easy way of navigating the parent/child operation structure that is similar to what you see with a Decomposition Tree but keeps all the detail columns that the diagnostics data provides you with and means you don’t have to leave the Power Query Editor.

I would love to hear your feedback about this and know if there are any bugs or other information that could be added. You can download a sample pbix file for this function here.


2 responses

  1. @Chris

    This is great
    I just modified the Select column step to include a Column called “Step” – which is the name of the step

    This gives a table from which I can select the Step name and the Exclusive Duration including child operations to find out time each step takes

  2. Pingback: An M Function To Help You Explore Power Query Diagnostics Data |

Leave a Reply to sam Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: