Counting Returning Customers in DAX

As promised, I’m going to be putting up a few more DAX examples here – although they may not demonstrate any new and interesting concepts, I’m still learning the language and blogging is a good way for me to practise implementing common calculations.

So, today’s problem is: how do we find the number of distinct customers who bought something in the current time period and who have also bought something in the past? That’s to say we want to count the number of returning customers, as opposed to completely new customers who have never bought anything from us before. This is, basically, a variation on the problem of how to get a distinct count in DAX that Marco has already dealt with comprehensively here, but with aspects of a time intelligence calculation (a topic which is well covered here). I’ll be using two tables from Adventure Works to illustrate this: FactInternetSales and DimDate.

What we need to start off with is find the set of distinct customers who bought something in the current time period. The following simple DAX expression finds this set and gives me the distinct count:

=COUNTROWS(DISTINCT(FactInternetSales[CustomerKey]))

We now need to filter these customers so that we only return the ones who bought something in the range of dates from the very first date we have data for, up to the day before the first date in the current time range. This can be accomplished with the DatesBetween, FirstDate and DateAdd functions. With the DatesBetween function, if you pass a Blank value to the first parameter it will give you the first date you have data for as your start date (see here for another example of this); for the end date in the range, we find the first date in the current date range with FirstDate, then get the day before using DateAdd:

DATESBETWEEN(
DimDate[FullDateAlternateKey]
,BLANK()
, DATEADD(FIRSTDATE(DimDate[FullDateAlternateKey]), -1, DAY)
)

We then need to use this date range inside the Filter function as follows:

=CALCULATE(COUNTROWS(DISTINCT(FactInternetSales[CustomerKey]))
, FILTER(DISTINCT(FactInternetSales[CustomerKey])
, CALCULATE(SUM(FactInternetSales[SalesAmount]),
DATESBETWEEN(DimDate[FullDateAlternateKey]
,BLANK()
, DATEADD(FIRSTDATE(DimDate[FullDateAlternateKey]), -1, DAY))
, ALL(DimDate))>0))

However, there’s one last problem to solve: on the first date in the time dimension using the DateAdd function in this way will return a Blank value, and using Blank as both a start and an end date will result in us getting all the dates in the column back from the DatesBetween function. This means that for this first date we get the same value as the distinct count because we are finding the set of all customers who bought something on that first date and applying a filter to see whether these customers bought something on any date, which of course they did:

image

So we need to use the IF function to check if using DateAdd to get the previous date in this way returns a Blank, and if it does to return a Blank. Here’s the final expression:

=IF(
DATEADD(FIRSTDATE(DimDate[FullDateAlternateKey]), -1, DAY)=BLANK()
, BLANK()
,CALCULATE(COUNTROWS(DISTINCT(FactInternetSales[CustomerKey]))
, FILTER(DISTINCT(FactInternetSales[CustomerKey])
, CALCULATE(SUM(FactInternetSales[SalesAmount]),
DATESBETWEEN(DimDate[FullDateAlternateKey]
,BLANK()
, DATEADD(FIRSTDATE(DimDate[FullDateAlternateKey]), -1, DAY))
, ALL(DimDate))>0)))

image

One last point: even though I’ve written this expression on the RC version of PowerPivot, I still need to use ALL(DimDate) on the last line I am joining FactInternetSales and DimDate using the OrderDateKey column, which is an integer surrogate key. This wouldn’t be necessary if I was joining the two tables using a datetime column. Personally I think this is going to confuse no end of people and is a bit rubbish – especially because, in many data warehouses, the only way to join a fact table and a time dimension table is using an integer surrogate key; even worse, if you try to bring a datetime column from a time dimension table down onto the fact table using a calculated column, you’ll get a circular reference error. Hopefully this will get fixed in the next release…

UPDATE: I’ve found a better way of doing this in DAX. Instead of using the Filter function, which can be a bit slow, you can just use the filter arguments of the Calculate function itself. Here’s the new version:

=IF(
CALCULATE(COUNTROWS(), DATEADD(FIRSTDATE(DimDate[FullDateAlternateKey]), -1, DAY))=0
, BLANK()
, CALCULATE(
COUNTROWS(DISTINCT(vwFactInternetSalesWithDate[CustomerKey]))
,DISTINCT(vwFactInternetSalesWithDate[CustomerKey])
, DATESBETWEEN(DimDate[FullDateAlternateKey]
,BLANK()
, DATEADD(FIRSTDATE(DimDate[FullDateAlternateKey]), -1, DAY)
)))

Here, the two filter arguments I’m using are:

  1. DISTINCT(vwFactInternetSalesWithDate[CustomerKey]), to filter by the customers who bought something in the current time period, and
  2. DATESBETWEEN(DimDate[FullDateAlternateKey], BLANK(),  DATEADD(FIRSTDATE(DimDate[FullDateAlternateKey]), -1, DAY)), to filter by the set of all dates up to the day before the first date in the current period

ANDing these two arguments together gives us a context that contains only rows that have a customer that bought in the current time period and dates up to the current time period. It’s a little harder to understand what’s going on here but a lot more elegant, I think.

 

18 thoughts on “Counting Returning Customers in DAX

  1. I am doing a scorecard solution with PowerPivot. I am calculating my own status indicators, as PowerPivot KPIs do not support the status indication calcs I am looking for. I am past that :). Now I want to come up with a measure that calculates ‘how green’ is the scorecard we are looking at. Do you have any good advice on how I can count rows in a table that a referencing a measure?

  2. Hi Chris,

    Just wondering if you can share your knowledge, I’m trying to get the number of unique customers based on the start and end dates of their subscription. I can do this in SQL but am having trouble getting it to work in DAX. Basically this is the query I’m using but this doesn’t seem to work for me:

    =CALCULATE(DISTINCTCOUNT(Details[Customer ID]),
    DATESBETWEEN(
    Dates[Date]
    ,Details[SubscriptionStartDate]
    ,Details[SubscriptionEndDate]
    )
    )

    So when I use this in PowerPivot I want to see the fiscal month in my horizontal row labels and then the unique customer count for each period. My date table is joined to my detail table using the invoice date, and I have 2 more inactive joins to the SubscriptionStartDate and SubscriptionEndDate columns within the Detail table.

    Hope you can help.

  3. Hi Chirs

    I’m new to this one and trying to grab the concept of tabular.. Why is the Grand Total is not showing for returning customer?

    Thanks

    1. That’s because the Grand Total represents all years – and at that level, because there is no previous year, there can’t be any returning customers.

  4. Hi Chris,

    I”m using named set in my mdx queries for new customers so that I could use this named set on row level further to see another measures like Sales and Margins etc. My queries is as follows.

    With
    Set [NewCustomer_DateCNQtrBase] AS

    EXCEPT(
    NONEMPTY(
    [Customer].[Customer].[Customer Name].MEMBERS
    , ([Measures].[Freight],[Date_CN].[Date_CN].CurrentMember))
    ,
    NONEMPTY(
    [Customer].[Customer].[Customer Name].MEMBERS
    , ([Measures].[Freight],[Date_CN].[Date_CN].CurrentMember.Lag(8):[Date_CN].[Date_CN].CurrentMember.Lag(1)))
    )

    SELECT non empty
    {[Measures].[Freight]} ON 0,
    non empty[NewCustomer_DateCNQtrBase]
    ON 1
    FROM [DARCL]
    where ([Date_CN].[Date_CN].[Financial Year].&[2016-17].&[Quarter-I])

    It works fine in mdx but I need to do same thing in my tabular model for new customers and top 20 customers. So can we make named set in DAX? If named set functionality is not available in DAX then whats the best alternate of this? I tried to use Calculated Tables but it filter data on company level and stored it in memory after process and when I use it further in excel sheet then it does not work on another filter dynamically. Thanks for any help on it.

  5. Hi Chris!

    I need to calculate returning customers but bucket them by if they are current(Shopped within last 12 months) or lapsed(haven’t shopped in last 12 months) or former(haven’t shopped since last 25months or more). I am using dynamic segmentation to created this segments and defined min and max of months that I intend to use in my date filters. the below query works for me if I just want to calculate the number of one type of customer. I am struggling to write DAX that helps me define one metric that calculates the number of all the segments. I want to be able to replace -24 and -13 with min and max from my segment. Any help would be appreciated.

    IF(
    CALCULATE(COUNTROWS(‘AMC PERIOD’), DATEADD(FIRSTDATE(‘AMC PERIOD'[CALENDAR_DT]), -1, DAY))=0
    , BLANK()
    , CALCULATE(
    COUNTROWS(DISTINCT(‘Transaction'[Indiv_ID]))
    ,DISTINCT(‘Transaction'[Indiv_ID])
    , DATESBETWEEN(‘AMC PERIOD'[CALENDAR_DT]
    ,DATEADD(FIRSTDATE(‘AMC PERIOD'[CALENDAR_DT]), -24, MONTH) , DATEADD(FIRSTDATE(‘AMC PERIOD'[CALENDAR_DT]), -13, Month)
    )))

    Thanks!
    Monali

  6. i have two table sales_order_item and date table…

    active customers gross:=

    CALCULATE(DISTINCTCOUNT(sales_order_item[id_customer]),sales_order_item[status_code]>=1)

    new customer gross:=

    CALCULATE(DISTINCTCOUNT(sales_order_item[id_customer]),sales_order[order_rank]=1,sales_order_item[status_code]>=1)

    i want a dax query to find the new customer behaviour.. a customer who is new for month of january 2020, whether he had purchased anything on next month (feb 2020)…so basically i want to find the how many new customers from january2020 have became active customers in february 2020…

Leave a Reply to Chris WebbCancel reply