When you add a field to the filter pane in a Power BI report and select the “Basic filtering” filter type, in most cases you’ll see some numbers next to the field values:
These numbers are counts of the number of rows for each value in the table that the field is from. The query to get these counts is usually quite fast and inexpensive, but if you’re filtering on a field from a very large table (for example a fact table) and/or using DirectQuery mode that might not be true. For example, the screenshot above is taken from a DirectQuery dataset and here’s the SQL query that generates the counts shown:
Luckily there’s a way to disable these queries and stop the counts being displayed: set the Discourage Implicit Measures property to true on your dataset. The main purpose of this property is to stop the automatic creation of implicit measures when building Power BI reports; this makes sense when you’re using calculation groups, for example, and when you add a calculation group to your dataset then this property is set to true for you. You can also set Discourage Implicit Measures to true manually by connecting to the dataset using Tabular Editor:
Here’s what the filter pane looks like with Discourage Implicit Measures set to true:
Just to be safe, I think it makes sense to set Discourage Implicit Measures to true for all DirectQuery datasets to reduce the overall number of queries run against your data source and reduce the risk of a really expensive query being run. I don’t think seeing the counts adds much value for end users anyway.
[Thanks to Ed Hansberry for pointing out this behaviour and John Vulner for explaining it]
Last year I wrote a post about a change in the Power BI Service that meant the CPU Time associated with Power Query transformations was added to the total shown for a dataset refresh operation in Profiler traces and Log Analytics:
This was useful, but it didn’t tell you directly how much CPU Time was used by Power Query and it didn’t tell you which tables or partitions in a refresh were using the most CPU. It also didn’t tell you anything about Power Query memory usage. The good news that recently there has been another change that solves these problems.
Let’s say you have a Power BI dataset that consists of a single table whose source is the following Power Query query:
let
Source = #table(type table [MyNumber = number], List.Transform({1 .. 1000000}, each {_})),
#"Added Custom" = Table.AddColumn(
Source,
"ARandomNumber",
each Number.RandomBetween(0, 10000),
type number
),
#"Sorted Rows" = Table.Sort(#"Added Custom", {{"ARandomNumber", Order.Ascending}})
in
#"Sorted Rows"
This query creates a table with a million rows, adds a column with random numbers in and then sorts on that column – which is, as you’d expect, a very CPU and memory-hungry operation.
If you refresh this dataset in the Power BI Service and run a Profiler trace on it, looking at the Command Begin/End and Progress Report Begin/End events, this is what you’ll see:
The final Command End event shows the toal duration of the refresh as well as the amount of CPU used by both the Analysis Services engine and Power Query – in this case 24094ms.
If you look at the Progress Report End event associated with the finish of the refresh for the only partition of the only table in the dataset (highlighted in the screenshot above), there’s some extra information:
It shows the amount of CPU Time and the maximum amount of memory used by the Power Query engine while refreshing this partition. In this case the Power Query engine used 19468ms of CPU and reached a peak of 581848KB of memory. I can tell this is going to be really useful for troubleshooting refresh performance issues and out-of-memory errors.
[Thanks to Akshai Mirchandani, Xiaodong Zhang, Ting-Wei Chang and Jast Lu for this information]
Recently we announced an important new optimisation for DirectQuery datasets: the ability to run (some) of the queries generated by a single DAX query in parallel. You can read the blog post here:
A few of us on the Power BI CAT team have tested this out with customers and seen some great results, so I thought I’d write a post illustrating the effect this optimisation can have and explaining when it can and can’t help.
For the purposes of this blog post I built a DirectQuery dataset (the source and the design of the dataset are irrelevant) using the New York Taxi data. I then created a table with five measures on columns; the precise definitions of the measures don’t matter much either, except for the fact that they each generate a single Storage Engine request and horizontal fusion cannot combine these requests in any way.
Next, I published the dataset to the Power BI Service, connected to it from Tabular Editor, set the compatibility level property to 1569 and saved the change:
Having done this the Max Parallelism Per Query property became visible, and I set it to 1:
1 is not the default value for this property but it does give the same behaviour as the default at the time of writing – it ensures that the Storage Engine requests for a single DAX query are always executed one at a time.
Next I connected DAX Studio up to the XMLA Endpoint for the workspace and ran the DAX query generated by the table visual above with Server Timings enabled. The new timeline view in DAX Studio 3.0.6 (see Marco’s video for more details) does a great job of showing when the Storage Engine queries – in this case the SQL queries sent to the data source – are made:
As you can see, in this case it’s clear that the six Storage Engine queries/SQL queries generated by this one DAX query are all executed in sequence. Also notice that the duration of the DAX query was 3 seconds.
I then changed the Max Parallelism Per Query property to 6 and reran the same query from DAX Studio. Here’s what the timeline looked like:
The Storage Engine queries/SQL queries now all run in parallel and the overall duration of the DAX query is just 0.9 seconds!
This is great, but as always there are a few things that need to be highlighted. First, Power BI may not be able to run all the Storage Engine queries in parallel if there are certain types of dependency in your DAX. In the example above there were no dependencies between the measures – none of them referenced any other measures, and they all used fairly simple expressions – but in the real world that might not be the case. To illustrate this I created a new measure with the following definition:
I then used it in a new table visual and captured the DAX query generated:
Here’s what the DAX Studio Server Timings showed for this query with Max Parallelism Per Query set to 6:
This time there is some parallelism but the first Storage Engine query/SQL query has to complete before the last two can be executed. I wrote this measure specifically to get this behaviour so you may be able to rewrite your DAX to get better parallelism, but it’s something to be aware of.
One last thing to mention: increasing parallelism here may lead to worse query performance overall if you end up running into bottlenecks elsewhere in your architecture (see here for an example). You may need to increase the “maximum number of connections per data source” property on your dataset to allow more queries to run in parallel, and if that results in additional load on the data source then you may need to scale up or scale out there too. Remember also that the DAX queries for a report page are, and always have been run in parallel so the Storage Engine queries for different DAX queries will also be run in parallel; you’ll need to watch out for report pages with a large number of visuals on them.
Some good news for those of you using DirectQuery mode in Power BI: the ApproximateDistinctCount DAX function, which returns an estimate of the number of the distinct values in a column and which can be a lot faster than a true distinct count as returned by the DistinctCount function, is now available to use with BigQuery, Databricks and Snowflake sources. It only worked with Azure SQL DB and Synapse before; RedShift is coming soon. You can use it in exactly the same way that you would with the DistinctCount function except that it only works in DirectQuery mode.
For example, I have a Power BI DirectQuery dataset that uses the New York Taxi data in Snowflake as its source. With the following two DAX measures defined on the Trip table:
…I can build a table visual that compares the output of the two measures (as you can see, the difference isn’t that big):
…and see that the ApproximateDistinctCount DAX function is translated to the APPROX_COUNT_DISTINCT function in Snowflake SQL:
select { fn convert(count(distinct("MEDALLIONID")), SQL_DOUBLE) } + { fn convert(max("C1"), SQL_DOUBLE) } as "C1",
approx_count_distinct("MEDALLIONID") as "C2"
from
(
select "DATEID",
"MEDALLIONID",
"HACKNEYLICENSEID",
"PICKUPTIMEID",
"DROPOFFTIMEID",
"PICKUPGEOGRAPHYID",
"DROPOFFGEOGRAPHYID",
"PICKUPLATITUDE",
"PICKUPLONGITUDE",
"PICKUPLATLONG",
"DROPOFFLATITUDE",
"DROPOFFLONGITUDE",
"DROPOFFLATLONG",
"PASSENGERCOUNT",
"TRIPDURATIONSECONDS",
"TRIPDISTANCEMILES",
"PAYMENTTYPE",
"FAREAMOUNT",
"SURCHARGEAMOUNT",
"TAXAMOUNT",
"TIPAMOUNT",
"TOLLSAMOUNT",
"TOTALAMOUNT",
case
when "MEDALLIONID" is null
then CAST(1 as INTEGER)
else CAST(0 as INTEGER)
end as "C1"
from "NYCDATA_DB"."NYCTAXIDATA"."TRIP"
) as "ITBL"
Distinct counts are often the slowest type of measure but in my experience report consumers are very unwilling to accept seeing “near enough” numbers in their reports rather than numbers that are 100% accurate, even if the approximate distinct counts are much faster. I heard someone suggest using field parameters to allow report consumers to switch between showing fast approximate distinct counts for exploration and accurate distinct counts when they really need them, and I think this is a great compromise.
Bonus links: if you need to do an approximate distinct count in Import mode, Phil Seamark shows how to do this here; Phil also wrote a great post on building aggregations for distinct counts (read it carefully – this is a really powerful technique!) here.
In the blog post for the October 2022 release of Power BI Desktop there’s a brief description of a new optimisation in Power BI for filters that perform searches in text columns. In this blog post I’m going to share more details about this optimisation and how you can make sure your reports can benefit from it.
First of all, a brief description of the problem. Let’s say you have a Power BI dataset containing just the following table:
Let’s also say you want to build a report where a user can search for certain terms in the Description field and display a filtered table in a report:
Note how the Description field has been dragged into the Filter pane, the filter type is set to “Advanced filtering”, the “Show items when the value” dropdown has “contains” selected and the search term is “citrus”, so the table on the left only shows the fruit where the description includes the text “citrus”. Some custom visuals such as the Text Filter visual have very similar functionality.
Here’s the DAX query generated for the table visual in this screenshot:
DEFINE
VAR __DS0FilterTable =
FILTER(
KEEPFILTERS(VALUES('Fruit'[Description])),
SEARCH("citrus", 'Fruit'[Description], 1, 0) >= 1
)
VAR __DS0Core =
CALCULATETABLE(
SUMMARIZE('Fruit', 'Fruit'[Fruit Name], 'Fruit'[Description]),
KEEPFILTERS(__DS0FilterTable)
)
VAR __DS0PrimaryWindowed =
TOPN(501, __DS0Core, 'Fruit'[Fruit Name], 1, 'Fruit'[Description], 1)
EVALUATE
__DS0PrimaryWindowed
ORDER BY
'Fruit'[Fruit Name], 'Fruit'[Description]
As you can see, the filter in this query is accomplished using the DAX Search() function. This is a great example of the type of query that the optimisation I’m talking about can speed up.
Here are some more details about how this optimisation works:
This optimisation is now enabled and works in the same way in both Power BI Desktop and the Power BI Service.
The first time that any query or measure that uses either the Search() or ContainsString() DAX functions on a text column is evaluated, Power BI starts to build a special text index just for that column.
This index build will only be successful if two conditions are true:
The text column must only contain characters from the classic 128 character ASCII set.
The index build must take less than 25 seconds. If 25 seconds elapse then the build will timeout and Power BI will continue to run the query without the index being present.
If the index build succeeds for that column then the index can be used by all subsequent queries by all users, but it will be dropped when:
Power BI Desktop is restarted, if you’re in Power BI Desktop.
The dataset is refreshed, either in Power BI Desktop or the Power BI Service.
The dataset is evicted from memory in the Power BI Service or when the dataset is under memory pressure.
DAX queries that use the index will be a lot faster than queries that do not, although the difference will only be noticeable when you are searching on a table with thousands of rows and in a column with reasonably long text values.
There is no way for you to know whether an index has been built or whether the build has failed, or if a DAX query uses an index, unfortunately. However if you look at the duration of the DAX queries that do this kind of search (for example in Log Analytics or by running a Profiler trace) and you see the first query after a refresh is relatively slow and subsequent queries are almost instant then it’s likely that the index has been built successfully; on the other hand if your queries are consistently slow then it’s likely the index has not been built successfully. Before you all leave comments complaining, I know this is not ideal and I hope we’ll be able to make further improvements in the future.
How can you ensure that the index is built successfully? The only way to ensure that you stay under the 25 second timeout limit is to reduce the amount of text that needs to be indexed, either by reducing the number of rows in the table or by reducing the amount of text in the column. Reducing the amount of text in the column is probably the only feasible option: for example you may be able to remove words like “and” and “the” from your text since users are less likely to need to search for them. Making sure your text column only contains ASCII characters is much harder because you can’t just remove all non-ASCII characters (such as characters with diacritics) without making your text unsearchable; I have a good-enough solution in the form of an M custom function which removes diacritics from characters in text and ensures that only ASCII characters are left here.
[Thanks to Jeffrey Wang for the information in this post]
Do you have a Power BI report with a table or a matrix on it, where there is a filter on the rows of the table? It’s a very common scenario indeed. Is the table or matrix slow to render? If so, this post is for you!
Consider the following table visual:
There are four columns: Date, Town and two measures. One measure called [Fast Measure] is, as the name suggests, very quick to execute; the other measure, called [Slow Measure], is very complex and slow. The definitions are irrelevant here. Notice that there is a filter on this table visual so only the rows where [Fast Measure] is greater than 1 are shown.
If I measure the amount of time to render this table in Performance Analyzer, it takes around 17.5 seconds to run. However, if I remove the filter on [Fast Measure], the table only takes 8 seconds to run. Why? The filter is on the fast measure and surely more rows are returned without the filter, so wouldn’t the slow measure be evaluated more?
It turns out that the DAX generated for tables with filters applied isn’t as well-optimised as it could be. Yes, I work on the Power BI team and yes this is something that should be addressed, but before that happens I thought it would be good to make as many people aware of this as possible so they can tune their reports. I have worked with a lot of customers recently who are running into this problem without realising it.
Here’s the DAX query for the table without the filter:
DEFINE
VAR __DS0Core =
SUMMARIZECOLUMNS(
'Date'[Date],
'Property Transactions'[Town],
"Fast_Measure", 'Property Transactions'[Fast Measure],
"Slow_Measure", 'Property Transactions'[Slow Measure]
)
VAR __DS0PrimaryWindowed =
TOPN(501, __DS0Core, 'Date'[Date], 1, 'Property Transactions'[Town], 1)
EVALUATE
__DS0PrimaryWindowed
ORDER BY
'Date'[Date], 'Property Transactions'[Town]
Here’s the DAX query for the table with the filter:
Notice how, in this second query, the filter is applied in an additional variable called __ValueFilterDM0. The side effect of implementing the filter in this way is that all the measures in the table, including [Slow Measure], are evaluated twice. This explains why, in this example, the filtered table is twice as slow as the unfiltered table. The performance of your table may be different: it depends on the number of measures, their definition, the nature of the filter and many other factors. Measuring the performance of your table with and without the filter applied will tell you how much of an impact this issue is having in your report.
What can you do about this? One solution is to apply the filter inside the measures rather than on the visual. In this case, creating two new measures with the following definitions:
Fast Measure Filtered =
VAR f = [Fast Measure]
RETURN IF ( f > 1, f )
Slow Measure Filtered =
IF ( [Fast Measure] > 1, [Slow Measure] )
..and then using these measures on columns instead of the original measures, removing the visual-level filter, like so:
In my example this results in the table rendering in 8 seconds again, but again your mileage may vary. What’s more I’ve turned off totals in the table and therefore avoided the problem of making these measures return the same totals as the totals in the original filtered table. There may be other solutions (I suspect calculation groups would be worth investigating) that work better in your dataset and report.
Are you affected by this problem? How much impact does it have? Do you have a different solution? Let me know in the comments…
Do you have a a large dataset in Power BI Premium or Premium Per User? Do you have more than six tables that take a significant amount of time to refresh? If so, you may be able to speed up the performance of your dataset’s refresh by increasing the number of tables that are refreshed in parallel, using a feature that was released in August 2022 but which you may have missed.
Some of you may be thinking: haven’t you blogged about refresh parallelism before? Yes: last year I wrote a post about setting the maxParallelism property when refreshing through TMSL scripts, a few months ago I also showed how you could change the same property when refreshing using the Enhanced Refresh API. That was all fairly complex though and what I’m going to show you here is, in contrast, very easy to implement.
To illustrate this I created an Import dataset containing nine tables, each of which loaded data from different CSV files. Each table contained a couple of million rows and took 20-30 seconds to refresh. I then published to a PPU workspace in the Power BI Service and used the technique described in Phil Seamark’s “Visualise your Power BI refresh” blog post to capture what happened during refresh. The refresh took 44 seconds as as you can see from this screenshot of Phil’s report, only six of the tables were refreshed in parallel at any one time:
[Ignore the y axis on this graph – I have no idea what’s going on with it]
This is as you would expect: the default number of objects that can be refreshed in parallel in Premium or Premium Per User is 6. This is also the maximum number of objects that can be refreshed in parallel in Shared capacity.
I then went to my original .pbix file, opened the Options dialog, went to the Current File/Data Load pane and changed the new “Parallel loading of tables” setting on my dataset from Default to Custom and entered the value 9:
This setting sets the maxParallelism property in the refresh commands generated by Power BI in the Power BI Service; it is fully documented here. Rerunning the refresh again in the Power BI Service I found that the overall time taken was down to 29 seconds and that all the tables were refreshed in parallel:
Here’s the Refresh command from the Profiler trace showing maxParallelism set to 9:
In summary: refresh performance got1/3 faster by making this simple change, so if you’re using Power BI Premium and not already doing advanced things with TMSL scripts or the Enhanced Refresh API, this is definitely something you should experiment with. As I said last year you can try setting this setting up to a value of 30 but there’s no guarantee you’ll get the amount of parallelism you ask for (it depends on the Premium capacity SKU and a few other factors). What’s more, setting this property too high could result in slower refresh because it might result in Power BI overloading your data source with queries.
[Thanks to Akshai Mirchandani for answering my questions about this subject]
A few weeks ago I wrote a post showing how you can use the new EvaluateAndLog DAX function to diagnose performance problems relating to the use of the Switch function. Did you know that calculation groups can experience similar performance problems though? In some scenarios limited evaluation takes place for all calculation items, not just the one you expect; luckily you can use EvaluateAndLog to diagnose this too. In this post I’ll show you a simple example.
Let’s say you have a dataset containing the following table, called Sales:
There’s also a calculation group with three calculation items:
The Measure Value calculation item returns the value of SelectedMeasure but also uses EvaluateAndLog:
EVALUATEANDLOG( SELECTEDMEASURE() )
The Measure Value * 2 and Measure Value * 3 calculation items do the same thing but multiply the value returned by SelectedMeasure by two and three respectively, for example:
EVALUATEANDLOG( SELECTEDMEASURE() * 2 )
There’s another table in the dataset, with no relationship to the Sales table, called Calc Item Names containing a single column listing the three calculation items’ names:
Last of all, there is a measure called Sales Amount:
Sales Amount = SUM(Sales[Sales])
…and a measure called Dynamic Sales Amount Unoptimised that takes the selection made on the Calc Item Names column of the Calc Item Names table and uses it to dynamically select the calculation item with that name and apply it to the Sales Amount measure:
All this allows you to build the following report where a slicer controls the calculation item applied in the Dynamic Sales Amount Not Optimised measure:
As the name suggests, though, there’s a problem with the Dynamic Sales Amount Not Optimised measure that EvaluateAndLog can help uncover. Using Jeffrey Wang’s DAXDebugOutput tool to capture what happens when Measure Value is selected in the slicer, you can see three DAXEvaluationLog events. The first is the one you’d expect: since Measure Value is selected in the slicer, the Measure Value calculation item is evaluated:
The other two events are for the other two, unselected calculation items though – and similar to the Switch problem, some evaluation taking place for all calculation items can cause performance problems:
How do you avoid this problem? It turns out that if you use TreatAs to apply the calculation item inside Calculate, like so:
Something I do all the time when performance tuning Power BI is use SQL Server Profiler to monitor query and refresh activity. There’s a handy external tool that lets you open up Profiler with a connection to Power BI Desktop; if you’re using Power BI Premium you can also connect Profiler up to a published dataset using the XMLA Endpoint. Profiler is a bit old-school though and likely to lead to snarky comments from SQL Server DBAs (it’s not deprecated for Analysis Services though!) who’ll say you should be using Extended Events (aka xEvents) instead. And guess what – did you know you can use Analysis Services xEvents to monitor activity in Power BI Desktop?
Doing this is quite simple. The only challenge to overcome is how to connect SQL Server Management Studio to an instance of Power BI Desktop, and the easiest way to do this is to install DAX Studio as an external tool, open it connected to Power BI Desktop, and then look in the bottom right-hand corner of the screen for the address of the instance of the Analysis Services engine inside Power BI Desktop that DAX Studio is connected to:
You can then use this address in the connection dialog of SQL Server Management Studio:
Once you’ve connected, under the Management folder in the Object Explorer pane you’ll find there’s already an xEvent session running you can look at:
You can also create a new session by following the instructions here.
All very interesting, but what’s the point of doing this? I admit that I find Profiler much quicker and easier to use but xEvents have two slight advantages over Profiler:
Frustratingly, all the timestamps in Profiler for Analysis Services events are rounded to the nearest second, which makes answering questions like “which event started or finished first?” much harder to answer. In contrast, all the timestamps for xEvents are very granular:
2. There are some events that are only available in xEvents and not in Profiler traces. The ExecutionStatistics event shown in the screenshot above is a great example: the text column for it a lot of detailed information that I don’t know how to interpret (yet) and which isn’t documented but apparently can be used to determine the amount of time spent in the Formula Engine among other things. This is useful because the widely used method of doing this – add up the amount of time spent in the Storage Engine and subtract from the overall query duration – is unreliable and will become even more unreliable in the future.
I love this kind of detailed information and I’m sure there are a lot of practical problems it can be used to solve. The bad news is that you can’t use this technique against Power BI Premium, at least at the time of writing. Look out for more blog posts on this subject in the future!
Over the years I’ve seen a few examples of how issues with an organisation’s corporate network can affect Power BI report performance. I’ve never blogged about them because, to be honest, I know next to nothing about networks and I’m not sure I could describe them properly. However, recently, I have seen a few instances of what I think could be a widespread but little-diagnosed problem – so I would like you to check if it’s happening to you and tell me what the cause is, if you can find out.
Some time ago I wrote a post about how Power BI report performance could be worse in Internet Explorer 11 and older Windows operating systems because some report performance optimisations we do are only available with HTTP/2. You can read it here:
I hope you’re not still on Windows 8.1 or using IE11. Some traffic could still be being forced to use HTTP/1.1 though and this could have a big impact on report performance – in some cases making reports 5-10 seconds slower to render. How can you tell? You will need to use the Developer Tools available in Chrome and Edge. I blogged about how to use the Network tab in Developer Tools to measure Power BI report performance here:
You will also need to make sure the Protocol column is visible by right-clicking in the central pane where the traffic is displayed:
Here’s an example of what you’ll see in the Network tab when the report renders and you have the problem:
There are two things to look out for:
In the Protocol column you’ll see the value “http/1.1” rather than “h2” for some events, most importantly the query events
You’ll see a long grey bar before the green bar in the Waterfall diagram for these events showing that the connection is stalled for several seconds
If possible you should compare the performance of running your report inside your corporate network and outside it, using the technique described in the blog post mentioned above to see how long the report takes to render in both cases. If the report is faster outside your corporate network, and you only see traffic being forced to use http/1.1 and connections stalled inside your corporate network, then you have a problem and need to talk to your networking team to find out what’s going on. I’ve heard ZScaler could be the culprit but I’m not sure, so if you do have this problem and do find out the cause and the solution, please leave a comment below.