Solving the ‘Events in Progress’ Problem in DAX, Part Two: Aggregating Dates and Multiselect

You may recall that by the end of last week’s post on solving the ‘events in progress’ problem in DAX, I’d managed to get the right values out of PowerPivot at the Date level with good performance. BUT when, for example, I selected a Calendar Year I got no values out, and I hadn’t even thought of how to handle situations such as when Date was on filter and the user had selected three different, non-consecutive dates. Well, that was last week when I was a DAX newbie – this week, well, I’m slightly less of a newbie and I’ve got the solution.

First, let’s restate the problem a bit more clearly. What I want is this: when I have a date on columns or rows, I want my calculated measure to display the number of rows in the FactInternetSales table that have an OrderDate less than or equal to that date, and have a ShipDate that is greater than or equal to that date. This way I see the number of rows in the table where something has been ordered but not yet shipped. When I select a year, I want to see the number of rows where that OrderDate-ShipDate range intersects with the dates in that year. So if I selected the year 2002, I want all the rows where either the OrderDate is in 2002, or the ShipDate is in 2002, or when the OrderDate is before 2002 and the ShipDate is after 2002. Finally if I make a multiselect when Date is on a filter, for example if I multiselect the 1st January 2002, the 5th of January 2002 and the 10th January 2002, I want to count the number of rows where the OrderDate-ShipDate range includes any of those three dates.

I spent several hours thinking how I could do this in DAX using just the columns I had and I came up with the following formula, which gave me the results I was expecting… when it ever returned, because it was horrendously slow:

=CALCULATE(
COUNTROWS(‘FactInternetSales’)
, FILTER(‘FactInternetSales’
, COUNTROWS(
FILTER(
VALUES(‘DimDate'[DateKey]),
(‘FactInternetSales'[OrderDateKey]<=’DimDate'[DateKey])
&&
(‘FactInternetSales'[ShipDateKey]>=’DimDate'[DateKey])
))>0))

It’s using the Calculate function to evaluate the expression CountRows(FactInternetSales); what it’s trying to do in the outermost filter is to filter all of FactInternetSales, and find all the rows where there is at least one Date in the current selection that is between the OrderDateKey and the ShipDateKey. But, as we learned last week, doing a filter on all the columns in a table is very inefficient, and what we actually want to do is to filter on just the columns we’re interested in: OrderDateKey and ShipDateKey. However what I found was that I really needed all of the distinct combinations of OrderDateKey and ShipDateKey, but I could only get either all of the columns in a table using the table name, or the distinct set of values in a single column using the Values() function. Therefore I couldn’t actually filter on the distinct combinations of OrderDateKey and ShipDateKey where OrderDateKey>=DateKey>=ShipDateKey. This seems like a limitation of DAX from what I can see, though it might turn out that there is a way of doing it – if there is I will of course blog about it.

Anyway, the solution was to create a new column that did contain the distinct combinations of OrderDateKey and ShipDateKey. I went into the PowerPivot designer and created a new column called OrderShip on FactInternetSales with the following expression:

=CONCATENATE(‘FactInternetSales'[OrderDateKey],’FactInternetSales'[ShipDateKey])

As you can see, it simply concatenates the OrderDateKey and ShipDateKey values into a 16 character string:

image

I was then able to use that column in a filter as follows, extracting the OrderDateKey and the ShipDateKey from the string using the Left and Right functions, and turning them back into numbers using the Value function:

=Calculate(
CountRows(FactInternetSales)
, FILTER(
VALUES(‘FactInternetSales'[OrderShip])
, COUNTROWS(
FILTER(
VALUES(‘DimDate'[DateKey]),
(‘DimDate'[DateKey]>=VALUE(LEFT(‘FactInternetSales'[OrderShip], 8)))
&&
(‘DimDate'[DateKey]<=VALUE(RIGHT(‘FactInternetSales'[OrderShip],8)))
))>0))

The logic is the same as with the previous expression, with the all-important exception that the outer Filter only works on the distinct values of OrderShip. It works and it’s fast, so I’m relatively happy although I would like to find a more elegant solution.

It not only gives the correct values for individual dates:

but also for Years (and other aggregations of Dates – also notice how the sum of the years is greater than the same as the Grand Total, because of course an order can be in progress in two different years and so needs to be counted in each):

and also for multiselect (where again the value is not the sum of the values of the three individual dates):

I think this illustrates one area where DAX is superior to MDX. MDX is notoriously bad at handling multiselect, but as Marius explained to me in DAX multiselect support is built-in: the Values() function is rather like a CurrentSet function, returning the set of all selected (‘existing’ in MDX terms) values in a column in a particular context.

So, another problem solved… time to think up the next challenge!

9 thoughts on “Solving the ‘Events in Progress’ Problem in DAX, Part Two: Aggregating Dates and Multiselect

  1. What is the equivalent of Excel Row() function in DAX.If I have a Data in PowerPivot and I just want to add a calculated column called say ID which is 1,2,3…as many as there are rows how can this be done with DAX

  2. Hi Chris,

    You assisted me in a previous incarnation of this in MDX, which worked very well as an aside! I’m trying to resolve the same thing in DAX now, with the scenario below to calculate Payment Days

    Measure – Outstanding Amount = 1000

    We then use a reverse running sum method to count back the days of payments taken to reach the current outstanding amount, eg:

    Date Outstanding Amount Payment Amount Payment Days
    1/1/2013 20
    2/1/2013 30
    3/1/2013 50 30 2
    4/1/2013 1000 40 3

    In the above example, the PaymentDays value on the 4/1/2013 would be 3, as it took 3 days to reach the outstanding amount of 1000 (40 + 30 + 30). On the 3/1/2013 the value would be 2, as it only took two days of payments to reach/exceed the value (30 + 30 = 60) of 50.

    Can the above solution be modified to work in DAX?

    Thanks

    Mike

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Hi Mike,

      It sounds doable, although I’d have to think about it! BTW, rather than use the approach described in this post for solving the events-in-progress problem in DAX, I now prefer the approach I wrote about here: http://cwebbbi.wordpress.com/2011/11/10/solving-the-events-in-progress-problem-in-dax-v2-0/

      Chris

      1. Thanks Chris, well if you have any brain waves on your mental schedule let me know 🙂 Thanks for update on the new approach, I’ll check it out also.

        Just when I thought I was confident on MDX, along comes DAX and it all starts again!

        Cheers

        Mike

Leave a ReplyCancel reply