PowerPivot Top N Reports Using Excel Cube Formulas

Top N reports are an extremely common requirement: my customers are always trying to find their top 10 products or sales people or geographies by some measure or other. Luckily this type of report is fairly easy to build in PowerPivot if you’re using a PivotTable; in fact, Rob Collie wrote a good blog post on this subject only last week which is well worth a read. The problem with PivotTables is, however, that they are a pain to format and many people prefer to use Excel cube formulas for their dashboards – and unfortunately dynamic Top N reports are surprisingly difficult to implement with cube formulas. As the discussions here and here show, even when you’re using CubeSet and CubeRankedMember you need to know MDX pretty well and even then it’s a bit messy. Here, instead, is a pure DAX solution to the problem which, while not as simple as I’d like, involves no MDX, no clever use of Excel cube functions, and works when you select more than one item in a slicer.

The first thing you need to do is to create a table with as many rows in as you need items in your Top N report. In my example I’m going to return the top 10 products in a model built from the Adventure Works database, so here’s my table (called TopNRank):

image

Here’s my model in Diagram View:

image

And here’s a screenshot of my main worksheet, for reference, with two Slicers on CalendarYear and EnglishProductCategoryName; a PivotTable with a Top 10 filter applied on EnglishProductName (to check the output and for debugging); and below it my Excel formulas, with the ten values from the TopNRank table on rows and two measures called [TopN Product Name] and [TopN Product Sales] on columns, showing the same top 10 values:

image

Step 1 is to create a measure called [Sales] that simply sums up the values in the [Sales Amount] column:

Sales:=SUM([SalesAmount])

You can then create a measure, called [Product Rank] here (and shown in the PivotTable above), that returns the rank of each product by [Sales] for the current year and category:

Product Rank:=
IF(
  ISBLANK([Sales])
    , BLANK()
    , RANKX(ALL(DimProduct[EnglishProductName]), [Sales], [Sales], 0, Dense)
)

The basic idea for this approach is that with the Excel cube formulas, you’re going to use the values from the TopNRank table on rows and then use a measure to return the name of the top Nth Product for each row. This measure needs to return the name of the product that has the same rank value as whichever value from the TopNRank table is on rows. For example, in the screenshot above, in cell D21 there is a CubeMember function that returns the value 1 from TopNRank table; in cell D22 there is a CubeValue function that references the new measure, and this filters the list of all Products to return the name of the Product where [Product Rank] is 1, which is Road-150 Red, 48 (as you can see from the PivotTable).

There’s a problem with this approach, however, and that is that the RankX function always returns tied ranks when two products have the same value for [Sales]. So, in the PivotTable in the screenshot above, there are two products with the rank 2 because they have the same value for the [Sales] measure – and this causes big problems for the approach described in the previous paragraph. Despite what BOL says you can’t calculate a rank by more than one column, so the only way to get around this is to ensure that tied ranks can never occur, and the way I’ve done this is to rank by [Sales] and the name of the product by using the following measures:

Product Name:=
FIRSTNONBLANK(VALUES(DimProduct[EnglishProductName]), DimProduct[EnglishProductName])

Product Name Rank:=
IF(
  ISBLANK([Sales])
    , BLANK()
    , RANKX(ALL(DimProduct[EnglishProductName]),[Product Name])
)

Combined Rank:=
[Product Rank] + (1/[Product Name Rank])

Untied Product Rank:=
RANKX(ALL(DimProduct[EnglishProductName]), [Combined Rank],,1)

With this done, at long last it’s possible to create the measure that returns the name of the Top Nth product as follows:

TopN Product Name:=
IF(
  ISFILTERED(‘TopNRank'[TopNRank]) && ISBLANK([Sales])=FALSE()
    , FIRSTNONBLANK(
        FILTER(VALUES(DimProduct[EnglishProductName])
        , [Untied Product Rank]=VALUES(‘TopNRank'[TopNRank]))
    , DimProduct[EnglishProductName])
  , BLANK()
)

And here’s the measure that returns the value of [Sales] for each product:

TopN Product Sales:=
IF(
  ISFILTERED(‘TopNRank'[TopNRank]) && ISBLANK([Sales])=FALSE()
  , CALCULATE(
    SUM(FactInternetSales[SalesAmount])
    , FILTER(
      VALUES(DimProduct[EnglishProductName])
      , DimProduct[EnglishProductName]=[TopN Product Name]))
    , BLANK()
)

I’ve been told by the customer that implemented this approach that performance on larger models, while acceptable, is a bit slow and that it gets worse the more items you display in your top n list. This doesn’t surprise me and to be honest I’ll need to do some experiments to see if I can improve performance.

You can download my sample workbook (Excel 2010 64 bit, PowerPivot V2.0) from here.

3 thoughts on “PowerPivot Top N Reports Using Excel Cube Formulas

  1. U are a genius, I wasted around an hour on issue of same Ranks in a result set and finally got the solution here. Logic Rank1+1/Rank2 is gr8. Great Job. 🙂

  2. This is a wonderful explanation and I thank you for posting it. I have a question – is it possible to parameterize this setup to allow the user to select specific measures and particular sort fields based on disconnected slicers?

    For key locations I would like our users to use a single CUBE-based dashboard that would give them the top 10 according to two selected dimensions (plus the usual collection of slicers filtering the data like data range, PO Type, …).

    A) Metric (e.g. measures)
    Spend
    Quantity Ordered
    Purchase Order Count

    B) Top 10 sorted by (e.g. columns)
    Vendor
    Item
    Location

    I can build the metric (A) disconnected table and assign the correct measure, but I can’t seem to do the same with (B) as I haven’t figured how the DAX variable measure can take a field identifier as a value. Even for the variable measure (A), when I enter it into the DAX formulas you have above I get an error that DAX can’t resolve the value of the measure.

    I then tried, based on another of your columns, to create an MDX query in CUBESET, and was able to build strings for the TopN values such as [Orders].[Item Description] but it doesn’t appear I can get the *other* slicers (date range, PO Type, etc) into the CUBESET function without ending up hardcoding those too, and there are far too many!!

    So, I can build 3×3 measure sets as you have above, but that’s certainly a daunting task even before thinking about maintenance… Perhaps this is something DAX isn’t ready to do, but I’m hoping it’s something I just don’t know how to do!

Leave a Reply