Chris Webb's BI Blog: Reading The Power Query Trace File

In the September 2014 update of Power Query a new piece of functionality was added: the ability to turn on tracing. There’s no documentation about what the trace files actually contain anywhere (it’s clearly intended to be used only by Microsoft to help them diagnose problems) but I couldn’t help but be curious about what’s in there. And of course, when faced with a strange text file to make sense of, I turned to Power Query!

First of all, to turn on tracing, click on the Options button on the Power Query tab in the Excel ribbon, then check the ‘Enable Power Query tracing’ box:

Clicking on the ‘Open traces folder’ link will take you to the directory where the trace files are stored, which in my case is:

C:\Users\Chris\AppData\Local\Microsoft\Power Query\Traces

You can then run some queries and you’ll see trace log files appear in that folder:

[This is where the speculation starts]

As far as I can see, every time you run a Power Query query two files are created: one with a filename beginning “excel”, the other with a filename beginning “Microsoft.Mashup.Container”. All of the interesting things I found were in the “Microsoft.Mashup.Container” files so I’ll ignore the second type of file from now on.

The format of these files is pretty clear from this screenshot:

Each line in the file starts with “DataMashup.Trace.Information”, then there’s a number (which seems to be the same in all cases) and then there’s a JSON fragment. There are two things to point out before you attack this file in Power Query thought:

The obvious way to get rid of everything before the JSON is to split the column twice using a colon as the delimiter. However if you do this using the default UI settings you’ll find that the JSON is mysteriously broken – all the double quotes have disappeared. This is in fact a side-effect of the way the UI uses the Splitter.SplitTextByEachDelimiter() function: it uses a null in the second parameter, which translates to the default value of QuoteStyle.Csv, but to stop the JSON breaking you need to change this to QuoteStyle.None.
When you have got rid of everything but the JSON, you just need to click Parse/JSON and you can explore the data in there:

Here’s an example of a query I generated to read a log file. I don’t want you to think this is a query that will read every log file though: the format may vary depending on the query you run or the version of Power Query you have. I also encountered a few bugs and strange error messages in Power Query while experimenting (I think some were caused by trying to read from files while tracing was turned on, or while the files were open in Notepad) so I can’t guarantee you’ll be able to read the files you’re interested in.

[sourcecode language=”text” padlinenumbers=”true”]
let
Source = Table.FromColumns({
Lines.FromBinary(
File.Contents(
"Insert full path and file name of log file here!")
,null,null,1252)}),
#"Split Column by Delimiter" = Table.SplitColumn(
Source,
"Column1",
Splitter.SplitTextByEachDelimiter(
{":"},
QuoteStyle.None,
false),
{"Column1.1", "Column1.2"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",
{{"Column1.1", type text}, {"Column1.2", type text}}),
#"Split Column by Delimiter1" = Table.SplitColumn(
#"Changed Type",
"Column1.2",
Splitter.SplitTextByEachDelimiter(
{":"},
QuoteStyle.None,
false),
{"Column1.2.1", "Column1.2.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",
{{"Column1.2.1", Int64.Type}, {"Column1.2.2", type text}}),
#"Parsed JSON" = Table.TransformColumns(#"Changed Type1",
{{"Column1.2.2", Json.Document}}),
#"Removed Columns" = Table.RemoveColumns(#"Parsed JSON",
{"Column1.1", "Column1.2.1"}),
#"Expand Column1.2.2" = Table.ExpandRecordColumn(
#"Removed Columns",
"Column1.2.2",
{"Start", "Action", "Duration", "Exception",
"CommandText", "ResponseFieldCount"},
{"Start", "Action", "Duration", "Exception",
"CommandText", "ResponseFieldCount"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Expand Column1.2.2",
{{"Start", type datetime}, {"Duration", type duration}})
in
#"Changed Type2"
[/sourcecode]

At this point there’s plenty of useful information available: a list of events, a start time for each event, a duration for each event, and other columns where error messages, durations, SQL queries and the number of fields returned by each SQL query. All very useful information when you’re trying to work out why your Power Query query is slow or why it’s not working properly. It’s a shame there isn’t any documentation on what’s in the trace file but as I said, it’s not really for our benefit so I can understand why.