Visualising Power Query Diagnostics Data In A Power BI Decomposition Tree

Recently I’ve been working with the Power Query team to come up with some ways to help developers understand the data returned by the new Power Query diagnostics functionality. In this, the first of two posts, I’m going to share a Power Query function that reformats diagnostics data in a way that makes it easy to visualise using the Power BI Decomposition Tree visual.

Here’s the code for the function:

Here’s how to use it:

  1. Copy the code above into a new blank query in the Power Query Editor window, and give the query a meaningful name like DiagnosticsForDecompTree
  2. Pass in the name of a query created by using the Query Diagnostics functionality (you’ll be able to choose any query in your pbix file using the dropdown box) and invoke the function
  3. Load the output of the new query that gets created by invoking the function into your Power BI dataset
  4. Add a new Decomposition Tree visual to a report page
  5. Drag the Exclusive Duration column into the Analyze well of the visual
  6. Drag all the columns whose names start with “Level” (ie “Level.1”, “Level.2” and so on) into the Explain By well of the visual
  7. Add a filter or slicer on the Query column to filter the data shown in the visual down to a single query evaluation.

The output will look something like this:

What does this function do? The original output of a diagnostics query has one row for each event; each event may have one or more child events, and the Path column shows the details of this parent/child structure. The function takes this structure and expands it out into multiple columns, so for each event you see all of its ancestor events in separate columns on the same row. Since each event’s duration only gives you the time taken for it to run and doesn’t include the time taken by its child events, when you break down an event in the Decomposition Tree you’ll see some child events with the name (Blank) – these give you the duration of just the parent event and make sure everything adds up properly. This function should work for both the detailed diagnostics output and the summary output.
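To give you a flavour of how this works – and this is just a minimal sketch of the core idea, not the full function, and it assumes the diagnostics output has a Path column containing slash-delimited event ids – here’s how the Path column could be expanded into Level columns in M:

[sourcecode language=’text’]
//Sketch only: expand a slash-delimited Path column into Level.1,
//Level.2 etc columns for the Explain By well of a Decomposition Tree
(Diagnostics as table) as table =>
let
    //Find the maximum depth of the event parent/child hierarchy
    MaxDepth =
        List.Max(
            List.Transform(
                Diagnostics[Path],
                each List.Count(Text.Split(_, "/"))
            )
        ),
    //Generate one column name per level: Level.1, Level.2 and so on
    LevelNames =
        List.Transform(
            {1 .. MaxDepth},
            each "Level." & Text.From(_)
        ),
    //Split Path into the Level columns; events nearer the root of the
    //tree get nulls in the lower-level columns
    SplitPath =
        Table.SplitColumn(
            Diagnostics,
            "Path",
            Splitter.SplitTextByDelimiter("/"),
            LevelNames
        )
in
    SplitPath
[/sourcecode]

The real function does more than this – for example it’s also responsible for adding the (Blank) rows described above – but the Level columns are what the Decomposition Tree visual needs.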

You can download a sample pbix file showing an example of how to use it here.

This is only a first draft of the function, so I’d really appreciate it if you could try this with your own diagnostics data and check if it works properly and also if it produces useful output. Please leave your feedback in a comment either here or on the Github Gist. There are almost certainly going to be bugs!

Of course the real question is how you interpret what you see in the Decomposition Tree and how you can use it to improve the performance of your queries. That, my friends, is a journey we’re on together, but I think that being able to visualise diagnostics data easily is an important first step to being able to understand what’s going on inside the Power Query engine when a query is evaluated.

[Thanks to Jorge Gomez Basanta, Colin Popell and Matt Masson for their help with this code]

The “Visual Has Exceeded The Available Resources” Error In Power BI

[Update 18th June 2023: this post is about this error in the Power BI Service. If you’re running into this error in Power BI Desktop read this post]

Very occasionally you may find that a report that is slow to render – but still renders successfully – in Power BI Desktop shows the error “Visual has exceeded the available resources” for some visuals when published to the Power BI Service:

Clicking on “See details” will show a dialog that looks something like this:

This visual has exceeded the available resources. Try filtering to decrease the amount of data displayed.

Please try again later or contact support. If you contact support, please provide these details.

More details: Resource Governing: The query exceeded the maximum memory allowed for queries executed in the current workload group (Requested 1048580KB, Limit 1048576KB).

The official Power BI documentation gives similar advice to what’s shown in this dialog, but what’s really going on?

The information in the “More details” section of the dialog gives you a clue: in this case it’s resource governance. When you run a DAX query in Power BI it will always use a certain amount of memory, and inefficient DAX calculations can cause a query to try to grab a lot of memory. In Power BI Desktop these queries may run successfully, if slowly, but the Power BI Service can’t just let a query use as many resources as it wants (if it did, it might affect the performance of other queries being run by other users), so there is a resource governor that kills queries that are too resource-hungry. In the case of the visual above, the query behind it tried to use more than 1GB of memory and was killed by the resource governor.

The limits enforced by the resource governor cannot be changed in Power BI shared capacity. The limits are higher in a Premium capacity and vary depending on the capacity size, and if you are a Premium Capacity Admin there are a number of different settings on a capacity you can change that will affect this, described here. For example the Query Memory Limit (%) setting controls the amount of memory that any given query can use before the resource governor kicks in.

To demonstrate this I moved the workspace containing the report in the screenshot above to a Power BI Embedded A4 capacity and then changed the Query Memory Limit % setting to 1.

This resulted in the error dialog showing that the resource governor killed the query when it tried to use 1% of the 25GB of RAM available in an A4, which is 262144KB:
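That number is easy to verify: 25GB is 25 × 1,048,576KB = 26,214,400KB, and 1% of 26,214,400KB is 262,144KB.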

While it may be useful to change this setting if you’re the admin of an overloaded Premium capacity, the real solution is to tune the DAX calculations used in this visual (I suggest you read the section on materialisation starting on p568 of “The Definitive Guide to DAX” or this older article which covers the same ground) so they use less memory and return faster or change what the visual displays.

You may see the “Visual has exceeded the available resources” error in other scenarios where the resource governor is not involved. For example, setting the Query Timeout setting on the capacity to 5 seconds like so:

…means that the visual now times out after five seconds, and the “See details” dialog shows the following error instead:

More details: The XML for Analysis request timed out before it was completed. Timeout value: 5 sec.

Introduction To M In Power Query And Power BI

Since last week’s post proved very popular (it turns out people want video – who knew?) I thought I’d post the recording of a presentation I did earlier this year at the Power BI User Group in Wellington, New Zealand, called “Introduction to M in Power Query and Power BI”. It’s a session I’ve presented at various conferences and user groups over the years, but I think this is the best version of it:

It’s an hour-long introduction to the M language that sits behind Power Query in Power BI, Excel, dataflows, Analysis Services and other tools. It covers pretty much everything you need to know about the language, including let expressions; tables, records and lists; error handling; and writing custom M functions.
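If you want a quick taste of what some of those concepts look like, here’s a trivial, entirely artificial example that crams several of them into a few lines:

[sourcecode language=’text’]
let
    //A record with two fields, one of which is a list
    Person = [Name = "Chris", Scores = {1, 2, 3}],
    //A custom M function that adds one to a number
    AddOne = (x as number) as number => x + 1,
    //Error handling: if anything goes wrong, return 0 instead
    Result = try AddOne(List.Sum(Person[Scores])) otherwise 0
in
    //Returns 7
    Result
[/sourcecode]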

If you’re keen to watch even more Power BI-related videos, my colleague Matthew Roche has just started his own YouTube channel and published a video on dataflows vs datasets.

Why Does Power BI Query My Data Source More Than Once?

This week I was honoured to be a guest on the Guy In A Cube channel, and for my topic I decided to tackle one of life’s eternal questions: why does Power BI query my data source more than once when I refresh my dataset?

You can watch the video here:

Although I’ve never answered this question directly in a blog post before, almost every technique I showed in the video is something I’ve blogged about, so I thought it would be useful to collect all the links to these posts in one place to provide some background to what I show in the video.

The first thing to say is that this is really a Power Query question, not just a Power BI question; a lot of what I show will therefore be relevant to Excel’s Get&Transform, Power BI dataflows and every other manifestation of Power Query out there, although I can’t guarantee that all the options and behaviour will be the same in these places.

If you’re developing in Power BI Desktop and you think that refresh is taking a long time, you should definitely check whether the Power Query engine is hitting your data source more than once. There are lots of ways to do this. Some data sources have tools that show when they are queried, such as the Run History screen in Microsoft Flow that I show in the video or SQL Server Profiler. Other ways include using Fiddler for web services or Process Monitor for files.

Next, if you find that Power Query is indeed querying your data source multiple times, the most common explanation is “background refresh”. This happens by default when you refresh a dataset: Power Query refreshes all the data previews that you see when you click on a step in the Power Query Editor window. In most cases this is so quick you don’t notice it, but when you have lots of queries with lots of steps it can be a gigantic problem. You can turn it off in the Options dialog in Power BI Desktop, as shown in the video, or in the Excel Query Options dialog. I blogged about this issue here.

Another possible reason is the data privacy settings you have applied. This is a gigantic, complex topic and something that I and other people have blogged about many times. I wrote a multi-part series of posts explaining data privacy settings starting here and there’s also a great explanation of what’s going on in behind the scenes written by the dev team that I link to here; I also wrote a post here showing an example of the performance impact of data privacy checks. As I say in the post, however, do not play around with these settings unless you really know what you’re doing.

If you’re ok with writing some M code, using the technique I blogged about here to implement basic query folding on a web data source can make a big difference – in the example in the video, it was doing this that reduced the number of calls to the web service from six to three.
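The technique in that post is based on the Table.View M function. Here’s a rough sketch of the idea – not the code from that post, and it assumes a hypothetical web service that accepts a $top query parameter and returns a list of records as JSON:

[sourcecode language=’text’]
let
    //Hypothetical web service; $top limits the number of rows returned
    GetFromService = (optional top as nullable number) as table =>
        Table.FromRecords(
            Json.Document(
                Web.Contents(
                    "https://example.com/api/sales",
                    if top = null
                    then []
                    else [Query = [#"$top" = Text.From(top)]]
                )
            )
        ),
    //Table.View allows a downstream Table.FirstN to fold into the
    //web request, so only the rows needed are ever downloaded
    View =
        Table.View(
            null,
            [
                GetType = () =>
                    type table [Product = text, Sales = number],
                GetRows = () => GetFromService(null),
                OnTake = (count as number) => GetFromService(count)
            ]
        )
in
    View
[/sourcecode]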

Some transformations can also lead to data being read from a data source more than once. For example, in this post I showed how a merge transformation leads to multiple reads from the same Excel file and how to stop this happening and drastically improve performance.
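I won’t repeat the details of that post here, but one common way of stopping a merge from hitting the same source over and over again is to buffer one side of it in memory – a sketch, with made-up query names, not the exact code from that post:

[sourcecode language=’text’]
let
    //Customers and Sales are made-up queries that both read from
    //the same Excel file in this scenario
    //Buffer one side of the merge so it is only read once
    BufferedSales = Table.Buffer(Sales),
    //The merge now runs against the in-memory copy instead of
    //going back to the Excel file repeatedly
    Merged =
        Table.NestedJoin(
            Customers, {"CustomerID"},
            BufferedSales, {"CustomerID"},
            "SalesRows",
            JoinKind.LeftOuter
        )
in
    Merged
[/sourcecode]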

Moving on, another important lesson to learn is how referenced queries are evaluated. This is one of the most counter-intuitive things about Power Query! With the set of referenced queries used in my demo:

…where three queries called Referenced Query 1, Referenced Query 2 and Referenced Query 3 each reference a query called Call Web Service and are loaded into a dataset (and where Call Web Service is not loaded into the dataset), most people would assume that when a refresh takes place the following happens:

  1. The Call Web Service query is run, getting the data from the web service
  2. The data returned is then passed to Referenced Query 1, Referenced Query 2 and Referenced Query 3

This is wrong. In fact what happens is this:

  1. Referenced Query 1 is refreshed. Because it references the Call Web Service query, that means the Call Web Service query is also evaluated.
  2. Referenced Query 2 is refreshed. Because it references the Call Web Service query, that means the Call Web Service query is also evaluated.
  3. Referenced Query 3 is refreshed. Because it references the Call Web Service query, that means the Call Web Service query is also evaluated.

This explains why the Call Web Service query was still being evaluated three times even after the query folding code mentioned above had been added to it. You can read a detailed explanation of how referenced queries are evaluated here.

There is something we can do to help here though: the Power Query engine can cache the results returned from some types of data source, such as web services, but to take full advantage of this caching in this particular scenario you have to turn off the Enable Parallel Loading Of Tables option in the Options dialog, as I show here. With that done, the three referenced queries run one after the other instead of in parallel. When the first of these queries runs it calls the Call Web Service query, which gets data from the web service; when the next two queries run and trigger two more evaluations of Call Web Service, the data for the call to the web service has already been cached. This caching can be turned off using the IsRetry option in Web.Contents, as Curt Hagenlocher explains in a comment here.
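For completeness, here’s what that looks like – a one-step sketch with a made-up URL:

[sourcecode language=’text’]
let
    //IsRetry=true tells the Power Query engine not to use a cached
    //response for this call (the URL is made up)
    Source =
        Web.Contents(
            "https://example.com/api/sales",
            [IsRetry = true]
        )
in
    Source
[/sourcecode]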

The final thing shown in the video is that the infamous Table.Buffer M function does not improve performance in this case and in fact makes performance worse. If you use Table.Buffer inside the Call Web Service query it will indeed buffer the result of the call to the web service into memory, but the data that is buffered cannot be shared between separate query executions – so the buffering happens three times, once for each time Call Web Service is evaluated by the three referenced queries. Again this is something discussed in this post. Of course there are plenty of other scenarios where Table.Buffer will help performance, but these will only be in cases where the same query requests data from the same data source more than once.
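To be clear about what “inside the Call Web Service query” means, here’s a sketch – again with a made-up URL – showing where the Table.Buffer call goes and why it doesn’t help:

[sourcecode language=’text’]
let
    //Imagine this is the expensive call to the web service
    Source =
        Table.FromRecords(
            Json.Document(
                Web.Contents("https://example.com/api/sales")
            )
        ),
    //Table.Buffer keeps the result in memory for this evaluation only;
    //each of the three referenced queries triggers its own evaluation
    //of this query, so the buffering happens three times
    Buffered = Table.Buffer(Source)
in
    Buffered
[/sourcecode]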

Phew, that was a lot. I don’t pretend to know everything about when and why Power Query gets data from a data source multiple times but this is the sum total of my knowledge right now. I hope it’s useful!

[Update: it’s just been pointed out to me, quite rightly, that the names of my queries are confusing. For example, “Referenced Query 1” is the query doing the referencing, not the query being referenced – which is Call Web Service. I haven’t changed the names because I want the blog post to remain consistent with the video, and I hope the diagrams and the context make everything clear]

BIFocal Show Interview

Just a quick post to let you know that an interview I recorded for the BIFocal Show podcast at the Microsoft Business Applications Summit in Atlanta a few months ago is now available for your listening pleasure:

John White and Jason Himmelstein, like less-furry versions of Paginated Report Bear, speak to all the top people in the world of Microsoft BI (they spoke to Marco Russo last week) so I highly recommend subscribing if you don’t do so already.

Paginated Report Bear And The Future Of Analysis Services Multidimensional

“Who, or what, is Paginated Report Bear?” I hear you ask. Well, he’s the breakout social media star of 2019, a furry YouTube sensation whose incisive interviews of members of the Power BI development team have become renowned for their deep technical content and insights into the Power BI roadmap. If you’re not watching his videos on YouTube, you’re missing out. Guy In A Cube is so 2018.

[That’s me on the left and Paginated Report Bear on the right]

Anyway, one question I am asked all the time is what the future holds for Analysis Services Multidimensional. While there is no firm news on what’s happening here, two of Paginated Report Bear’s recent interviews have discussed this topic and are particularly revealing. If you’re an SSAS MD and MDX fan I strongly recommend you watch the interviews with Josh Caplan:

…and Amir Netz:

I also had the honour to be interviewed by him at the Microsoft Business Applications Summit this week and we discussed this topic too, although I would like to stress that unlike Amir and Josh I have absolutely no influence on the decisions made in this area; in any case, my opinions on this topic might surprise some of you.

I’m Joining Microsoft

I have an announcement: tomorrow (Monday June 3rd) I’m starting a new job on the Power BI CAT team at Microsoft. It won’t affect what happens here on my blog, but I wanted to write this post because so many people have asked me why I’m making this move.

First and foremost the job at Microsoft offers some exciting new challenges for me that I wouldn’t get as a self-employed person. I’ll get to work on some of the biggest, most complex Power BI implementations in the world, provide feedback to the Power BI development team, and still be able to speak at conferences and do many of the other things I love doing now. I’ll also have the pleasure of working with a truly stellar bunch of colleagues who I know I’ll learn a lot from. And of course, what better product to work on than Power BI and what better tech company to work for nowadays than Microsoft? Power BI is going from strength to strength and I want to make a direct contribution to its future success.

What’s more the offer from Microsoft came at a time when I was getting a bit bored with the work I’ve been doing. If you do any job for long enough it gets repetitive and in my case after thirteen years (over a quarter of my life!) of running my own company I felt like I needed a change. Also, as I have made the shift from being a SSAS/MDX guy to being a Power BI guy I’ve been doing less and less technical consultancy and more and more training, mostly in the form of introductory Power BI courses. I enjoy training, I’d like to think I’m fairly good at it and it has proved very lucrative indeed – I just don’t want to be a full-time trainer, teaching the same material week after week.

Training and consultancy also involve a lot of travel. Over the last few years I’ve averaged more than ten nights per month in hotels and on top of that there were many nights when I got home late after a long journey back from a customer site. My wife has been very supportive and it’s all my kids have ever known, but it’s tiring and I want to spend more time with my family before my kids grow up and leave home. I’ve been to some interesting places on business I would never have been to otherwise and worked with some great companies, so yes, I have enjoyed myself. Business travel is nowhere near as glamorous or thrilling as it may seem, though, and I’m happy that I’ll be doing less of it. There’s also the risk that Brexit (if and when and how it ever happens) will stop me from working in Europe as easily as I have done in the past, so travelling as much might not even have been an option going forward.

Being self-employed has been a great experience and it’s something I would recommend to anyone who is thinking of doing it. I’m immensely grateful to all my customers, business partners and fellow members of the SQL and Power BI communities for making Crossjoin Consulting so successful. However it’s time for me to move on and try something new. Wish me luck! I’ll be back to blogging about Power BI, Power Query, SSAS, DAX and M next week.

Extracting All The M Code From A Power BI Dataset Using The DISCOVER_M_EXPRESSIONS DMV

DMVs (Dynamic Management Views) are, as the Analysis Services documentation states, “queries that return information about model objects, server operations, and server health”. They’re also available in Azure Analysis Services, Power BI and Power Pivot, and are useful for a variety of reasons, for example for generating documentation.

Several as-yet undocumented DMVs have appeared in Power BI recently and one that caught my eye was DISCOVER_M_EXPRESSIONS. Unfortunately, when I tried to run it in DAX Studio against an open Power BI file I got an error saying it was only available in the Power BI Service:

Luckily, now that XMLA Endpoints are in preview and SQL Server Management Studio 18 has been released (which supports connections to Power BI via XMLA Endpoints), we can test it against a published dataset stored in a Premium capacity. The following query, run from a DAX query window in SQL Server Management Studio:

[sourcecode language=’text’ padlinenumbers=’true’]
select * from
$system.discover_m_expressions
[/sourcecode]

…returns a list of all the Power Query queries in the selected dataset and their M code:

If you don’t have Premium you can run the same query from an Excel table against any published dataset using the technique I blogged about here:

I know there are other methods for doing this (for example using copy/paste), but it’s useful to be able to do this via a DMV because it means you can automate the process of extracting all your M code easily.

Some of the other new DMVs look like they are worthy of a blog post too – I can guess what most of them do from their names, but others are more mysterious and perhaps hint at features that have not been announced yet.

Making Power BI Drillthrough Return The “Right” Rows When You Use It With Complex Measures

When an end user sees a strange value in a Power BI report, their first reaction is usually to want to see the detail-level data from the underlying table. Power BI’s drillthrough feature is a great way of letting them do this, but it only returns meaningful results if you use it on measures that do simple aggregations such as sums or counts; if you have more complex calculations then usually what the drillthrough returns won’t be the rows that go to make up the value the user has clicked on.

Here’s an example. Say you have a simple Power BI model with a Sales table that contains the following data:

There is also a Date table with date and month columns, and the entire model looks like this:

Let’s say you create a measure called Sales Value that sums up the contents of the Sales column:

[sourcecode language=’text’ padlinenumbers=’true’]
Sales Value = SUM('Sales'[Sales])
[/sourcecode]

You could use this in a column chart to show sales by month, like so:

If the user wants to see the underlying data for one of the bars in this chart, drillthrough will work well – you just need to create another page (called, in this case, Month Drillthrough), put a table on it that displays the full contents of the Sales table:

[It’s important to note that it’s the Date column from the Sales table that’s shown here, not the Date column from the Date table]

Then drag the Month column from the Date table into the Drillthrough filter area:

…and you will be able to drillthrough from one of the columns in the chart, in this case the bar for May 2018:

…and that filter will be passed over to the Date Drillthrough page, so you only see the row in the table showing sales for May 5th 2018:

But what happens if you want to display year-to-date values in your column chart? If you create the following measure:

[sourcecode language=’text’ ]
YTD Sales = CALCULATE([Sales Value], DATESYTD('Date'[Date]))
[/sourcecode]

…and use it in the bar chart, you will see the following:

The problem comes when the user does the same drillthrough on May 2018 – which now shows the value 16 – and gets exactly the same table that they did before, showing only the sales transactions for May:

In this case, because the user clicked on the year-to-date value for May 2018 they would expect to see all the rows from the Sales table that went to make up that YTD value for May, that’s to say all the rows from the Sales table where the date was in the range January 2018 to May 2018.

The solution is to use some DAX that takes the month filter passed by the drillthrough and ensures that it filters the table shown not by the selected month, but all months in the year-to-date (similar to, but not exactly the same as, what I describe here).

Here’s a measure that does the job:

[sourcecode language=’text’ ]
SalesIgnoringDate =
VAR CurrentDateFromSales =
    CALCULATE(
        SELECTEDVALUE('Sales'[Date]),
        CROSSFILTER('Date'[Date], 'Sales'[Date], None)
    )
RETURN
    IF(
        CONTAINS(
            DATESYTD('Date'[Date]),
            'Date'[Date],
            CurrentDateFromSales
        ),
        CALCULATE(
            [Sales Value],
            CROSSFILTER('Date'[Date], 'Sales'[Date], None)
        )
    )
[/sourcecode]

What this does is:

  • Uses the DAX CROSSFILTER() function to disable the relationship between the Date and Sales tables, then uses the SELECTEDVALUE() function to find the date from the Sales table shown on the current row of the table on the drillthrough report page, and stores it in the CurrentDateFromSales variable.
  • Constructs a table using the DATESYTD() function and the Date column of the Date table, which contains all of the dates from the beginning of the current year up to and including the last date in the filter context – which will be the last date in the month selected in the drillthrough.
  • Uses the CONTAINS() function to see if the date saved in CurrentDateFromSales appears in the table returned in the previous step.
  • If it does appear, returns the value of the Sales Value measure. Once again, this needs to have the relationship between the Sales and Date tables disabled using the CROSSFILTER() function.

This measure can be used in the table on the drillthrough page instead of the Sales Value measure. Last of all, since your users will not want to see a measure called SalesIgnoringDate in their report, you can rename the SalesIgnoringDate column on the table to Sales Value.

Here’s the end result (in this case I created a new drillthrough page called YTD Drillthrough with the new measure on):

You can download the sample pbix file here.

This is just one example, and different types of calculation on your source page will require vastly different DAX measures on your drillthrough page to ensure that a meaningful set of rows is returned. The basic concepts will remain the same whatever the calculation, though: you need to create a measure that ignores the filter applied by the drillthrough and instead returns a value when you want a row to appear in your drillthrough table and returns a blank value when you don’t want a row to appear.

It’s a shame that drillthrough in the SSAS Tabular sense is not available in Power BI, because being able to set the Detail Rows Expression property on a measure in Power BI would make this problem a lot easier to solve.

Listing Windows Language Code Identifiers And Their Associated Date And Number Formats With M In Power BI/Power Query

In a comment on my blog post about international date and number formats and changing data types with the “using locale” option in Power Query/Power BI, Jan Karel Pieterse asked if there was any way to get a list of the thousands and decimal separators used for number formatting by each language and region. Since this is exactly the kind of geeky question that fascinates me, I decided to write an M query to answer it and – for bonus points – to find the default date format used too.

To start off, I found a table of all Windows Language Code Identifiers on this page:

Of course this can be loaded into Power Query easily using the “From Web” source. After that it’s easy to add a column to the table that takes a sample date (March 22nd 2018) and number (one hundred thousand and one tenth) and converts them to text using the language code identifier on each row: the Text.From() function does this for dates, and for numbers you have to use Number.ToText() if you want to get thousands and decimal separators. There are a few minor problems to deal with, such as the fact that Power Query doesn’t know what to do with dates for the “Congo Swahili” language code identifier and that some rows have multiple language tags, but nothing serious.

Here’s the full code:

[sourcecode language=’text’]
let
    //Sample dates and numbers to show
    SampleDate = #date(2018, 3, 22),
    SampleNumber = 100000 + (1 / 10),
    //MS web page with list of language tags
    LocaleWebPage =
        Web.Page(
            Web.Contents(
                "https://msdn.microsoft.com/en-us/library/cc233982.aspx"
            )
        ),
    LocaleList = LocaleWebPage{1}[Data],
    RemoveColumns =
        Table.SelectColumns(
            LocaleList,
            {"Language", "Location (or type)", "Language tag"}
        ),
    SplitColumn =
        Table.SplitColumn(
            RemoveColumns,
            "Language tag",
            Splitter.SplitTextByAnyDelimiter(
                {",", "or"},
                QuoteStyle.Csv
            ),
            {"Language tag"}
        ),
    //Create example columns
    DateExample =
        Table.AddColumn(
            SplitColumn,
            "Date",
            each Text.From(SampleDate, [Language tag]),
            Text.Type
        ),
    NumberExample =
        Table.AddColumn(
            DateExample,
            "Number",
            each Number.ToText(SampleNumber, "N", [Language tag]),
            Text.Type
        ),
    //Remove any rows containing errors
    RemoveErrors = Table.RemoveRowsWithErrors(NumberExample)
in
    RemoveErrors
[/sourcecode]

Here’s some of the output:

So, if you’ve ever wondered how the Cornish speakers of south-west England like to format their dates or whether the Oromo speakers of Ethiopia use a comma or a full stop as a decimal separator, wonder no more. And if you are not interested in M at all and just want to download an Excel workbook with a list of all LCIDs and how numbers and dates are formatted for them, you can do so here.