Recently we announced an important new optimisation for DirectQuery datasets: the ability to run (some) of the queries generated by a single DAX query in parallel. You can read the blog post here:
A few of us on the Power BI CAT team have tested this out with customers and seen some great results, so I thought I’d write a post illustrating the effect this optimisation can have and explaining when it can and can’t help.
For the purposes of this blog post I built a DirectQuery dataset (the source and the design of the dataset are irrelevant) using the New York Taxi data. I then created a table with five measures on columns; the precise definitions of the measures don’t matter much either, except for the fact that they each generate a single Storage Engine request and horizontal fusion cannot combine these requests in any way.
Next, I published the dataset to the Power BI Service, connected to it from Tabular Editor, set the compatibility level property to 1569 and saved the change:
Having done this the Max Parallelism Per Query property became visible, and I set it to 1:
1 is not the default value for this property but it does give the same behaviour as the default at the time of writing – it ensures that the Storage Engine requests for a single DAX query are always executed one at a time.
Next I connected DAX Studio up to the XMLA Endpoint for the workspace and ran the DAX query generated by the table visual above with Server Timings enabled. The new timeline view in DAX Studio 3.0.6 (see Marco’s video for more details) does a great job of showing when the Storage Engine queries – in this case the SQL queries sent to the data source – are made:
As you can see, in this case it’s clear that the six Storage Engine queries/SQL queries generated by this one DAX query are all executed in sequence. Also notice that the duration of the DAX query was 3 seconds.
I then changed the Max Parallelism Per Query property to 6 and reran the same query from DAX Studio. Here’s what the timeline looked like:
The Storage Engine queries/SQL queries now all run in parallel and the overall duration of the DAX query is just 0.9 seconds!
This is great, but as always there are a few things that need to be highlighted. First, Power BI may not be able to run all the Storage Engine queries in parallel if there are certain types of dependency in your DAX. In the example above there were no dependencies between the measures – none of them referenced any other measures, and they all used fairly simple expressions – but in the real world that might not be the case. To illustrate this I created a new measure with the following definition:
MEASURE WithDependencies = IF ( CALCULATE ( [Distinct Medallions], ALL ( 'GEOGRAPHY'[STATE] ) ) > 0, [Trip Count], [Monday Trips] )
I then used it in a new table visual and captured the DAX query generated:
Here’s what the DAX Studio Server Timings showed for this query with Max Parallelism Per Query set to 6:
This time there is some parallelism but the first Storage Engine query/SQL query has to complete before the last two can be executed. I wrote this measure specifically to get this behaviour so you may be able to rewrite your DAX to get better parallelism, but it’s something to be aware of.
One last thing to mention: increasing parallelism here may lead to worse query performance overall if you end up running into bottlenecks elsewhere in your architecture (see here for an example). You may need to increase the “maximum number of connections per data source” property on your dataset to allow more queries to run in parallel, and if that results in additional load on the data source then you may need to scale up or scale out there too. Remember also that the DAX queries for a report page are, and always have been run in parallel so the Storage Engine queries for different DAX queries will also be run in parallel; you’ll need to watch out for report pages with a large number of visuals on them.