Calculate(), DAX Fusion And Filters On 0 In Power BI

Here’s a fun DAX performance tip that I found this week. Do you have measures that use Calculate() with a filter on a numeric column? Is one of the filters on the value 0? If so then this may affect you.

As always, a simple example is the best way of explaining the problem and the solution. Consider the following table in an Import mode semantic model:

Here are some measures that reference that table:

'Sales Amount' = SUM('Sales'[SalesAmount])
'Apples Sales' = CALCULATE([Sales Amount], Sales[ProductID]=0)
'Oranges Sales' = CALCULATE([Sales Amount], Sales[ProductID]=1)
'Pears Sales' = CALCULATE([Sales Amount], Sales[ProductID]=2)
'Grapes Sales' = CALCULATE([Sales Amount], Sales[ProductID]=3)

Let’s say you then have a report which shows the value of the ‘Oranges Sales’ measure:

Running the DAX query generated by this table in DAX Studio with Server Timings enabled shows that there is just one Storage Engine query generated by this DAX query:

Here’s the xmSQL for that single SE query:

SET DC_KIND="AUTO";
SELECT
    SUM ( 'Sales'[SalesAmount] )
FROM 'Sales'
WHERE
    'Sales'[ProductID] = 1;

The WHERE clause here filters on the ProductID 1, which is the ID of the product Oranges.

Now if you add the measures ‘Pear Sales’ and ‘Grapes Sales’ to the table visual:

…and run the query in DAX Studio again, you’ll still see a single SE query:

Here’s the xmSQL:

SET DC_KIND="AUTO";
SELECT
    'Sales'[ProductID],
    SUM ( 'Sales'[SalesAmount] ),
    SUM ( 'Sales'[SalesAmount] ),
    SUM ( 'Sales'[SalesAmount] )
FROM 'Sales'
WHERE
    'Sales'[ProductID] IN ( 1, 2, 3 ) ;

As you can see, the WHERE clause now filters on the ProductIDs 1, 2 or 3: the IDs of the three products used in the three measures. This is DAX fusion – specifically horizontal fusion – in action. It’s an optimisation where multiple filters on the same column can be combined into a single SE query. Fewer SE queries is generally better for performance. So far so good.

Now let’s add the measure ‘Apples Sales’ to the table visual:

Running the DAX query in DAX Studio now shows there are two SE queries:

The first SE query has the same xmSQL as the previous DAX query:

SET DC_KIND="AUTO";
SELECT
    'Sales'[ProductID],
    SUM ( 'Sales'[SalesAmount] ),
    SUM ( 'Sales'[SalesAmount] ),
    SUM ( 'Sales'[SalesAmount] )
FROM 'Sales'
WHERE
    'Sales'[ProductID] IN ( 1, 2, 3 ) ;

The second SE query has the following xmSQL and a WHERE clause that indicates it is retrieving the data for just Apples:

SET DC_KIND="AUTO";
SELECT
    SUM ( 'Sales'[SalesAmount] )
FROM 'Sales'
WHERE
    'Sales'[ProductID] IN ( null, 0 ) ;

Two SE queries can mean worse overall performance. Why is the filter on 0 in the ‘Apples Sales’ measure special? Why does it result in a second SE query, why does this second SE query filter on 0 or null, and why doesn’t horizontal fusion take place for Apples?

The answer lies with how DAX handles blanks and zeroes, something discussed in depth in this article by Marco Russo. The filter condition in the ‘Apples Sales’ measure:

'Apples Sales' = CALCULATE([Sales Amount], Sales[ProductID]=0)

..actually filters on 0 or blank and that’s why the xmSQL of that second SE query filters on 0 or null, and that in turn explains why horizontal fusion does not take place – all the other measures filter on a specific number, the ‘Apples Sales’ measure filters on the number 0 or blank.

The solution is to update the measures in the model to use the strictly equal to == operator like so:

'Sales Amount' = SUM('Sales'[SalesAmount])
'Apples Sales' = CALCULATE([Sales Amount], Sales[ProductID]==0)
'Oranges Sales' = CALCULATE([Sales Amount], Sales[ProductID]==1)
'Pears Sales' = CALCULATE([Sales Amount], Sales[ProductID]==2)
'Grapes Sales' = CALCULATE([Sales Amount], Sales[ProductID]==3)

After this change the DAX query that returns the measures for ‘Apples Sales’, ‘Oranges Sales’, ‘Pear Sales’ and ‘Grapes Sales’ now generates a single SE query, meaning that horizontal fusion is taking place for all measures:

Here’s the xmSQL for that query:

SET DC_KIND="AUTO";
SELECT
    'Sales'[ProductID],
    SUM ( 'Sales'[SalesAmount] ),
    SUM ( 'Sales'[SalesAmount] ),
    SUM ( 'Sales'[SalesAmount] ),
    SUM ( 'Sales'[SalesAmount] )
FROM 'Sales'
WHERE
    'Sales'[ProductID] IN ( 0, 1, 2, 3 ) ;

As you can see, the WHERE clause now filters on the Product IDs 0, 1, 2 or 3.

This example uses an Import mode model but this tip also applies to DirectQuery mode and because additional SE queries (which mean additional SQL queries) can have more of an impact on performance in DirectQuery mode then ensuring horizontal fusion takes place can be even more important in DirectQuery mode.

I think this tip could benefit a lot of semantic models out there. A lot of measures use Calculate() and filter on numeric columns: the customer I was working with this week had measures that filtered on a year offset column on their date dimension table (so filtering on offset 0 meant this year, filtering on offset 1 meant last year and so on) and I reproduced the problem on my sample DirectQuery semantic model based on the ContosoDW sample database with a filter on the NumberChildrenAtHome column of the Customer dimension table. Also, I can’t remember the last time I saw the strictly equal to operator in the wild. If you find this tip helps you, please let me know by leaving a comment!

One thought on “Calculate(), DAX Fusion And Filters On 0 In Power BI

  1. Why wouldn’t we always use == for everything? Regardless of integer or text. Is there any downside to this?

Leave a Reply