Using The CubeSet() Function to Combine Selections From Multiple Slicers in Excel Power Pivot Reports

Since my last post on the Excel cube functions proved to be very popular, I thought I’d write a follow-up that shows off some more little-known but interesting functionality: how to use the CubeSet() function to combine selections from multiple slicers.

The Problem

Consider the following Power Pivot model based on data from the Adventure Works DW database:

image

The Internet Sales table contains one row per sales transaction; Product and Date are dimension tables that provide more information on the product bought in the transaction and the date the order was taken on. There’s one measure called Distinct Customers that returns the distinct count of customers; its definition is as follows:

Distinct Customers:=DISTINCTCOUNT('Internet Sales'[CustomerKey])

With this model you could build the following simple report by converting a PivotTable to cube functions:

image

Obviously this will tell you how many distinct customers bought the products selected in the slicer each year. You could also duplicate the slicer and the cube functions and have two reports side-by-side so you could compare the number of customers that bought two different groups of products:

image

The question here is, though, how can you create a third report that shows the number of distinct customers that bought any of the products selected across both slicers? So, using the example above, the report on the left shows the number of customers that bought Mountain-200 Black 38, Mountain-200 Black 42 or Mountain-200 Black 46; the report on the right shows the number of customers that bought Mountain-200 Silver 38, Mountain-200 Silver 42 or Mountain-200 Silver 46; the aim is to create a third report that shows the number of distinct customers that bought Mountain-200 Black 38, Mountain-200 Black 42, Mountain-200 Black 46, Mountain-200 Silver 38, Mountain-200 Silver 42 or Mountain-200 Silver 46 without having to select all those products in a third slicer.

The Solution

Step 1

The first thing to point out is that you can use the CubeSet() function to return the set of items selected in an Excel slicer. This is a widely-known technique – Rob Collie has a great post describing it here. What I didn’t realise until recently though was that one CubeSet() function can point to other cells containing CubeSet() functions and union all the items returned by those CubeSet() functions. This allows you to combine the selections made in multiple slicers, assuming that those slicers refer to the same field in your Power Pivot model. Here are the formulas in three cells that do this:

image

In cell I2 the formula uses the CubeSet() function to capture the selection made in the left-hand slicer in the report above, captioned Product Selection A, which has the name Slicer_Product. In cell I3 there is a similar formula to capture the selection made in the slicer on the right in the report, captioned Product Selection B, which has the name Slicer_Product1. The formula in I4 then uses a third CubeSet() function to combine the sets returned by the last two formulas into a third set.

Step 2

Now that you have a single CubeSet() formula in I4 that combines the selections in both slicers, you can use the trick I showed in my last post and reference that cell in CubeValue() formulas to get the number of distinct customers that bought any of the products listed in either slicer:

image

Here’s what the report looks like at this point:

image

Step 3

Notice that, thanks to the power of Power Pivot, customers are not double-counted in the new combined selection report on the right-hand side. For example, in 2003 841 customers bought one or more products selected in the slicer Product Selection A while in 2003 796 customers bought one or more products listed in the slicer Product Selection B. 841+796=1637, but in the combined selection report correctly states that only 1588 distinct customers bought products listed in either Selection A or Selection B. We can use these numbers to do one final, useful calculation: 1637-1588=49, so there were 49 customers who must have bought products in both Selection A and Selection B. The formula to do this is pretty straightforward and looks like this:

=IFERROR(($C19+$G19)-$J19, 0)

image

image

You can download the sample workbook for this post here.

Creating OR Filters With Cube Functions For Power Pivot And Analysis Services Reports In Excel

The Excel Cube Functions are incredibly powerful, and I’m still amazed at the kind of problems they can solve. This post describes how they can be used to build a report with a complex OR filter using data from Power Pivot (it’s equally applicable to SSAS) that shows a number of advanced uses of these functions. No knowledge of MDX or DAX is needed but if you’re new to the Excel Cube Functions I recommend that you watch this video of a presentation by Peter Myers, which provides an excellent introduction to them.

The Problem

Imagine you’ve got a very simple Power Pivot model that looks like this:

image

There’s a Date table with dates, months and years in, and an Internet Sales table with sales data in and a measure called Sales Amount. Here’s what the data looks like in a PivotTable:

image

Now, imagine that you want a report with the Sales Amount measure on columns and Years on rows, and you want to filter the data so that you only see values for Mondays in July or Wednesdays in September. Using the Fields, Items and Sets functionality you could filter the data to only show the day/month combinations you need for each year, but since you can’t put a named set into the Filter area of a PivotTable you would have to use Excel formulas to sum up the combinations to get the totals you need:

image

Lukcily it is possible to build the report you need using the Cube Functions! Here’s how:

Step 1: Build Your Combinations Using CubeMember()

The first thing to point out is that the CubeMember() function does not have to just return a member, it can return a combination of members (in MDX this is known as a tuple). These combinations can be built in several ways, one of which is by using cell references to other cells that themselves contain CubeMember() functions. It’s probably easier to explain this by showing a worksheet that contains six cells with CubeMember() functions in. Here it is with the formulas visible:

image

Cells B5 and B10 contain references to days of the week; cells B6 and B11 contain references to months. Cells B7 and B12 contain CubeMember() functions that return the combinations we want to filter by: Mondays in July and Wednesdays in September respectively.

Here’s what the formulas above return:

image

You’ll notice that the ‘combination’ cells only show the month names, not the day/month combinations – this is just a feature of the CubeMember() function and can be a bit misleading, but rest assured they do return the combinations you need.

Step 2: Build A Set Using CubeSet()

The CubeSet() function is also able to build sets using cell references to cells containin CubeMember() functions. In this case I want a set containing the two ‘combination’ CubeMember() functions from B7 and B12. I can do this by using the formula:

=CUBESET($B$2,($B$7,$B$12), "The set of combinations")

Here are the formulas on the worksheet at this point:

image

And here’s the output:

image

Step 3: Reference The CubeSet() Function In Your Report

Now you have a CubeSet() function that returns the two day/month combinations, you can use this in a cube function report. When you reference a cell containing the CubeSet() function in a CubeValue() formula, the CubeValue() formula will return the aggregated value of all of the combinations in the CubeSet(). So for example, here’s a report with the Sales Amount measure on columns, Years on rows, and displaying the Sales Amount for each year filtered by the two day/month combinations:

image 

And here’s the actual output:

image

Compare the numbers from the report at the bottom with the values calculated from the PivotTable in the screenshot earlier in this post, and you’ll see that we have indeed shown just the combined Sales Amount for Mondays in July and Wednesdays in September, broken down by Year.

You can download the example Power Pivot workbook for this post here.

Excel 2016 PivotTable MDX Changes Lead To Big Query Performance Gains

Yesterday, the Excel team announced a bunch of performance improvements in Excel 2016 for PivotTables connected to Analysis Services (Multidimensional or Tabular), Power Pivot and Power BI. If you haven’t read the official blog post already, here it is:

https://blogs.office.com/2016/07/07/faster-olap-pivottables-in-excel-2016/

In that post, Microsoft point out that how much of a performance increase you get will depend on a number of factors. I guess they have to do this to manage expectations. However I’m going to come right out and say that these changes are probably the most important thing that has happened in the last five years for Analysis Services or Power Pivot query performance and if you are using Excel PivotTables with SSAS or Power Pivot for reporting, you should upgrade to Excel 2016 (you’ll need the Office 365 click-to-run version) just for these changes.

The key improvement that Microsoft have made here is to stop Excel bringing back unnecessary subtotal values in the MDX queries that it generates. This has been a gigantic problem for years and several people have blogged about it: I did here, and among other people Rui Quintino and Richard Lees have posts on the subject that are worth reading. A lot of my consultancy work involves SSAS and MDX performance tuning and I think I see around five to ten customers every year who are seriously affected by this – and I’m sure this is just the tip of the iceberg. Even if you don’t think you are affected, you may have users who are putting up with slow refresh times for Excel-based reports without your knowledge.

Let’s go into some technical detail, see what the original problem was and what has changed in Excel 2016. I’m going to provide some examples using Excel 2013/2016 against SSAS Multidimensional but as I said the same problems occur (although less severely) with SSAS Tabular and Power Pivot.

Consider the following Excel 2013 PivotTable connected to the Adventure Works cube:

image

image

It’s typical of the kind of thing an end user wants to build, and there are two things to point out:

  1. Notice there are three fields that have been placed on the Rows axis of the PivotTable: the Gender and Marital Status hierarchies from the Customer dimension, and the Category hierarchy from the Product dimension. It’s pretty common for users to put many more fields together on an axis when they want to display detailed data.
  2. The user has also turned off grand totals and subtotals in the PivotTable so that only the detailed data is shown. This often happens in combination with (1) because if the user only wants to see the detailed data the subtotals and grand totals are a distraction.

In this case the PivotTable layout has been switched Tabular and the Repeat All Item Labels option is selected to make the PivotTable look like a table, but these are just aesthetic changes.

This PivotTable contains 12 rows, but if you look at the MDX query that is generated by Excel 2013 to populate it, it returns 36 rows of data (actually the MDX in this case returns 36 columns, but that’s a quirk of Excel) because the subtotals and grand total that Excel is not displaying are still being returned. Here’s the query:

SELECT 
NON EMPTY 
CrossJoin(
CrossJoin(
Hierarchize(
{DrilldownLevel(
{[Customer].[Gender].[All Customers]}
,,,INCLUDE_CALC_MEMBERS)}), 
Hierarchize(
{DrilldownLevel({[Customer].[Marital Status].[All Customers]}
,,,INCLUDE_CALC_MEMBERS)})), 
Hierarchize(
{DrilldownLevel({[Product].[Category].[All Products]}
,,,INCLUDE_CALC_MEMBERS)})) 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME 
ON COLUMNS  
FROM 
[Adventure Works] 
WHERE ([Measures].[Internet Sales Amount]) 
CELL PROPERTIES VALUE, FORMAT_STRING, 
LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

And here’s a sample of what it returns, with the unwanted values highlighted:

image

You can probably guess that these unwanted subtotals and grand totals make the query slower, and the more fields you put together on the rows or columns of a PivotTable the greater the number of subtotals/grand totals are returned and slower things get. PivotTables connected to SSAS or Power Pivot often become completely unusable with more than six or seven fields put together on the same axis, and the only workarounds before Excel 2016 are either to write custom MDX in a named set (I show how to do this here) or if you are using SSAS Multidimensional try to use member properties instead of separate fields (as Richard Lees shows here), but neither are really great alternatives.

Building the same PivotTable using Excel 2016, however, gives the following MDX:

SELECT 
NON EMPTY 
CrossJoin(
CrossJoin(
Hierarchize(
{[Customer].[Gender].[Gender].AllMembers}), 
Hierarchize(
{[Customer].[Marital Status].[Marital Status].AllMembers})), 
Hierarchize(
{[Product].[Category].[Category].AllMembers})) 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME 
ON COLUMNS  FROM [Adventure Works] 
WHERE ([Measures].[Internet Sales Amount]) 
CELL PROPERTIES VALUE, FORMAT_STRING, 
LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

Which returns just the data needed for the PivotTable:

image

What has changed? Focusing on just the Gender hierarchy, the difference between the two queries is that in Excel 2013 the selection on Gender is given using the DrillDownLevel() function on the All Member of the hierarchy. Isolated and put in its own query, the set expression used looks like this:

SELECT 
{[Measures].[Internet Sales Amount]}
ON COLUMNS,
DrilldownLevel(
{[Customer].[Gender].[All Customers]}
,,,INCLUDE_CALC_MEMBERS)
ON ROWS
FROM
[Adventure Works]

image

Notice how the All Member, used by Excel to return subtotals and grand totals, is returned by the set. Excel 2016 instead just asks for the members on the Gender level of the Gender hierarchy, not including the All Member. Again, put inside its own query, you can see what it returns:

SELECT 
{[Measures].[Internet Sales Amount]}
ON COLUMNS,
{[Customer].[Gender].[Gender].AllMembers}
ON ROWS
FROM
[Adventure Works]

image

Apart from the fact that no All Member is returned, it’s also a lot cleaner and easier to read.

I’ve been very excited about this change since I first heard it was happening, and I think it will benefit a lot of people. A big thanks to Alexander Lahuerta and the rest of the Excel team for making this happen!

Finding All Selected Items In A Slicer In Excel 2016 Using TextJoin()

When you are using slicers with an Excel PivotTable it’s often useful to be able to get a comma-delimited list of the items selected in that slicer for use in a report title. It’s not easy to do though, and in fact this is one of those topics that lots of people have blogged about over the years: here’s my MDX approach, here’s Erik Svensen’s post on using the new DAX ConcatenateX() function, and there are also posts by Rob Collie like this one. None of these techniques are ideal though: my personal favourite is the ConcatenateX() approach, but that only works with SSAS Tabular 2016 (and then only if you can create a measure on the model) or Power Pivot in Excel 2016, and not at all if you’re using SSAS Multidimensional or earlier versions of SSAS Tabular.

However, after discovering the new TextJoin() function in Excel 2016 the other week I realised that this would provide yet another way to solve this problem. Here’s a simple example using a PivotTable and slicer connected to a Power Pivot model:

image

The highlighted cell F3 showing a comma-delimited list of all the items selected in the slicer has the following Excel formula:

=TEXTJOIN(
    ", ",
    TRUE,
     IFERROR(
      CUBERANKEDMEMBER(
        "ThisWorkbookDataModel",
        Slicer_Product,
        ROW(
         INDIRECT("1:"&CUBESETCOUNT(Slicer_Product))
        )
       ), 
      "")
     )

Important: this needs to be entered as an array formula, so instead of hitting Enter after typing in the formula you need to hit Ctrl+Shift+Enter. You’ll see the formula surrounded by braces {} in the formula bar when you do this:

image

image

This formula relies on the fact that the selection in a slicer (in the example above the slicer has the name Slicer_Product) can be treated the same as the output of the Excel CubeSet() function, which means that you can use the CubeSetCount() function to find the number of items selected and the CubeRankedMember() function to get the name of any single item in the selection. It also uses the Row()/Indirect() trick described here to create an array of numbers from 1 to the number of items selected in the slicer, which in turn provides the rank values to pass to the CubeRankedMember() function.

The beauty of this approach is that it works for Power Pivot and all versions of SSAS Tabular and Multidimensional, and doesn’t require any measures to be created on your models/cubes. It even works in Excel Online, so it will work inside Power BI, although it doesn’t seem to be possible to create array formulas in Excel Online yet so you need to create the formula on the desktop before you deploy. Of course you need the latest build of Excel 2016 for all this to work, and at the time of writing most people don’t have Excel 2016 and even if they do they probably won’t have a build (Version 16.0.6568.2025 or higher) with TextJoin() in it yet. But this will be a great solution in the distant future when everyone has Excel 2016, I promise!

You can download the sample Excel 2016 workbook here.

I also have to acknowledge the help of David Hager in writing this formula – we had a conversation about how TextJoin() behaves in array formulas in the comments of my earlier post and in doing so he provided the basic approach for me.

Automatically Generating Date Dimension Tables In Excel 2016 Power Pivot

As you probably know, whenever you are doing any kind of date or time-based calculations in DAX you should always have a separate Date table in your Power Pivot model. There are a number of ways of building these tables (see, for example, my Power Query query here) but they are all a bit of a hassle – which is why it’s so cool that, in Excel 2016, you can get one built automatically inside the Power Pivot window.

Consider the following table of sales data on an Excel worksheet:

Source

With this table loaded into the Data Model (and the Order Date column recognised as containing data of the Date data type), when you go to the Design tab in the Power Pivot window you’ll see the new Date Table button enabled:

Menu

Clicking on the New button will add a new date table to the Data Model, called Calendar:

CalendarBasic

This table is automatically marked as the Date Table in your model.

The table contains a continuous range of dates starting from the beginning of the year of the earliest date found in any column in any table in your Data Model, up to the end of the year containing the latest date found in any column in any table in your Data Model. Obviously, this means that your table could contain a very large date range if, for example, you have a Customer table containing a Date Of Birth column. Luckily, you also have the option of manually configuring the range of dates used by clicking the Update Range button:

Range

One other thing to point out is that the resulting table is a table like any other, so you can add, delete or rename columns as you wish. You should also be able to set the table back to its default state by using the Set Default menu option, but I couldn’t make that work (possibly it hasn’t been implemented yet – this post was written using the Excel 2016 Preview).

If you do make changes like adding calculated columns, such as the Month Year calculated column shown below:

Calc

You can then click the Save Configuration button to save the current state of the table as your default. This means that the next time you create a new Date table in the same workbook, the table will include any customisations. However these changes don’t seem to be applied in Date tables created in new workbooks – maybe this will also change before RTM?

All in all, this is a very handy feature that will save Power Pivot modellers a lot of time. I wonder if it uses the new Calendar() or CalendarAuto() DAX functions under the covers?

Using DateDiff() To Calculate Time Intervals In DAX

One of the most useful new additions to DAX in Excel 2016 and the Power BI Designer is the DateDiff() function. It does exactly what you would expect: calculate the amount of time in between two dates, and express that value as either seconds, minutes, hours, days, weeks, months, quarters or years.

Here’s a very simple table of dates:

image

With this table loaded into the Power BI Designer, you can add new calculated columns to the table by clicking the New Column button on the ribbon. Here are two calculated column definitions that give the number of days and the number of years between the Start Date and the End Date on each row:

DayDurationColumn = 
DATEDIFF(MyTable[Start Date], MyTable[End Date], DAY)

YearDurationColumn = 
DATEDIFF(MyTable[Start Date], MyTable[End Date], YEAR) 

The output is pretty much what you’d expect:

image

It is of course also possible to create measures that use the DateDiff() function to, for example:

YearDurationMeasure = 
DATEDIFF(
FIRSTDATE(MyTable[Start Date]), 
LASTDATE(MyTable[End Date]), 
YEAR)

image

All very straightforward, then, and much easier than having to calculate these values yourself.

Using SelectColumns() To Alias Columns In DAX

A few years ago I wrote this post on how to alias columns in a table in DAX, using a combination of AddColumns() and Summarize(). The good news is that in Excel 2016/the Power BI Designer/SSAS Tabular 2016 there’s a new DAX function specifically for this purpose: SelectColumns(). Here’s an example of how it can be used:

Imagine you have the following source table, called Products:

image

You can write a DAX query to get all the rows and columns from this table like so:

EVALUATE Products

Here’s the output of that query in DAX Studio (and remember, DAX Studio can connect to data loaded into the Power BI Designer, which is what I’m doing here):

image

You can alias the columns in this table using SelectColumns() very easily, like so:

EVALUATE
SELECTCOLUMNS (
    Products,
    "Column One", Products[Product],
    "Column Two", Products[Colour]
)

Here’s the output:

image

The syntax for SelectColumns() is straightforward: the first parameter is a table expression, and after that there are pairs of parameters consisting of:

  • A new column name
  • An expression returning a column from the table given in the first parameter

As you can see in the output of the query above, I’ve renamed the Product column “Column One” and the Colour column “Column Two”.

This means I can now crossjoin a table with itself without needing to worry about conflicting column names, like so:

EVALUATE
CROSSJOIN (
    Products,
    SELECTCOLUMNS (
        Products,
        "Column One", Products[Product],
        "Column Two", Products[Colour]
    )
)

image

One other interesting thing to note about SelectColumns() is that it allows you to do projection in a DAX query easily – as Marco notes here, it was possible before but it wasn’t pleasant. For example, the query:

EVALUATE
SELECTCOLUMNS (
    Products,
    "Just Colour", Products[Colour]
)

Returns:

image

Notice how there are three rows in the output here and that the value Green occurs twice. If you’re a true DAX afficionado, you might get excited about that.