Introduction to MDX for PowerPivot Users, Part 4: Filtering

Chris Webb MDX, PowerPivot November 5, 2012December 18, 2012 4 Minutes

The combination of the filtering functionality built into PivotTables, and the ability to delete and reorder tuples in a set without needing to edit the set expression itself that the Excel named set functionality gives you, means that you can usually implement the filters you need in PowerPivot without needing to resort to MDX. However there are some scenarios where knowing the MDX functions that allow you to filter a set are useful and in this post I’ll show a few of them.

FILTER()

The Filter() function is the Swiss-Army penknife of filtering in MDX: it can do pretty much anything you want, but isn’t always the most elegant method. It’s quite simple in that it takes two parameters, the set that is to be filtered and a boolean expression that is evaluated for every item in the set and which determines whether that item passes through the filter or not.

Here’s a simple example. Consider a simple PivotTable (using my example model, described here) with FullDateAlternateKey on rows and the Sum of SalesAmount measure on columns:

The set of members on the FullDateAlternateKey level of the FullDateAlternateKey hierarchy can be obtained by using the .Members() function as I showed earlier in this series:
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS

This set can then be filtered by passing it to the Filter function, which itself returns a set, so it too can be used to create a named set. Let’s say we only wanted the set of dates where Sum of SalesAmount was greater than £10000; we could get it using the following expression:

Filter(
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS
, ([Measures].[Sum of SalesAmount])>10000)

What I’m doing is passing the set of all dates into the first parameter of Filter() and then, in the second parameter, testing to see if the value of the tuple ([Measures].[Sum of SalesAmount]) is greater than 10000 for each item in that set.

Here’s the result:

As I’ve mentioned before, it’s very important that you remember to check the ‘Recalculate set with every update’ button if you want the filter to be re-evaluated every time you change a slicer, which you almost always want to do.

Where is the filter function actually useful though? Here’s the same PivotTable but with Color on columns and with only Red and Black showing:

It’s not possible to filter this PivotTable to show only the dates where sales for Black products are greater than Sales of Red products using native functionality, but it is using MDX. Here’s the set expression:

Filter(
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS
, ([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])
>
([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Red])
)

Here I’m doing something similar to what I did in the first example, but now comparing two tuple values for each date: the tuple that returns the value of Sum of SalesAmount for Black products, which is
([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])
and the tuple that returns the value of Sum of SalesAmount for Red products, which is
([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Red])

NONEMPTY()

The NonEmpty() function also does filtering, but it’s much more specialised than the Filter() function – it filters items from a set that have empty values for one or more tuples. As with the Filter() function its first parameter is the set to be filtered, but its second parameter is another set, each of whose items are evaluated for each item in the first set. If one item in the second set evaluates to a non empty value for an item in the first set then that item passes through the filter.

That explanation is, I know, quite hard to digest so let’s look at an example. Here’s our PivotTable with no filter applied on rows and all Colors displayed on columns:

If you wanted to see only the rows where there were sales for Black products, you could use the following expression:

NonEmpty(
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS
, {([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])})

If you wanted to see only the rows where there were sales for Black OR Silver products, you could use this expression:

NonEmpty(
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS
, {([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])
,([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Silver])})

If you wanted to see only the rows where there were sales for Black AND Silver products you’d need to use two, nested NonEmpty functions:

NonEmpty(
NonEmpty(
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS
, {([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])})
, {([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Silver])})

TOPCOUNT(), BOTTOMCOUNT(), TOPPERCENT(), BOTTOMPERCENT()

The TopCount() and related functions are, as you’ve probably guessed from their names, useful for doing top N style filters. If you wanted to see the top 10 dates for sales of Black products you could use the following expression:

TopCount(
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS
, 10
, ([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])
)

Notice here how the dates are displayed in descending order for the Black column, but no other – that’s how you can tell that the TopCount() function is doing what you want.

To get the top N dates that provide at least 5% of the total sales across all time for Black products, you can use the following expression:

TopPercent(
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS
, 5
, ([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])
)

The BottomCount() and BottomPercent() function (I’m always reminded of one of the old, old jokes here when I use the BottomCount() function…) do the opposite and return the bottom items in a set, but you need to be careful using them because the bottom items in a set often have no values at all which is not very useful. So, for example, if you wanted to find the bottom 10 dates that have sales for Black products you need to use the NonEmpty() function as well as the BottomCount() function as follows:

BottomCount(
NonEmpty(
[DimDate].[FullDateAlternateKey].[FullDateAlternateKey].MEMBERS
, {([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])})
, 10
, ([Measures].[Sum of SalesAmount], [DimProduct].[Color].&[Black])
)

Here I’m taking the set of all members on the FullDateAlternateKey level of the FullDateAlternateKey hierarchy, passing that to the NonEmpty() function to return only the dates that have values for Black products and Sum of SalesAmount, and then getting the bottom 10 of those dates.

In part 5, I take a look at running MDX queries against a PowerPivot model.

Published by Chris Webb

My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel. View all posts by Chris Webb

Published November 5, 2012December 18, 2012

32 thoughts on “Introduction to MDX for PowerPivot Users, Part 4: Filtering”

Pingback: Introduction to MDX for PowerPivot Users, Part 5: MDX Queries « Chris Webb's BI Blog
Pingback: Introduction to MDX for PowerPivot Users, Part 3: The Members() and Crossjoin() functions « Chris Webb's BI Blog
john says:

October 10, 2014 at 12:00 pm

Hi Chris,

Do you have any ideas about the following cubefunction filter problem with two conditions:

I’ve managed to get the cube-function below working. It’s just an exampel, trying to return Revenue from days where the Revenue was greater than 1000. And it works

=CUBEVALUE(“ThisWorkbookDataModel”;”[Measures].[Revenue]”;CUBESET(“ThisWorkbookDataModel”;”{FILTER([Dates].[DateKey].members,([Measures].[Revenue]100,[Measures].[Revenue]<1000))}"))

Leaving the curly braces has no effect whatsoever. Do You have any ideas how to get this work?

Loading...

Reply
1. Chris Webb says:
  
  October 10, 2014 at 12:15 pm
  
  Hi John,
  Does this work?
  =CUBEVALUE(“ThisWorkbookDataModel”;”[Measures].[Revenue]“;
  CUBESET(“ThisWorkbookDataModel”;”{FILTER([Dates].[DateKey].members
  ,([Measures].[Revenue]>100) and ([Measures].[Revenue]<1000))}"))
  
  Loading...
  
  Reply
  1. john says:
    
    October 10, 2014 at 12:21 pm
    
    Yes! I just figured it out also myself 🙂 from https://social.msdn.microsoft.com/forums/sqlserver/en-US/bfa198db-cb51-478c-afaf-81c7317008c4/problem-with-cubeset-filtering.
    
    Great many thanks to you! Awesome blog!
    
    Loading...
john says:

October 10, 2014 at 12:03 pm

It seems that the start of the message disappeared. So the OK working version is: =CUBEVALUE(“ThisWorkbookDataModel”;”[Measures].[Revenue]”;CUBESET(“ThisWorkbookDataModel”;”{FILTER([Dates].[DateKey].members,([Measures].[Revenue]<1000))}"))

Loading...

Reply
john says:

October 10, 2014 at 12:28 pm

I’m still puzzled with a bit different problem. If I use something else than measures as a condition in filter, Excel gives again N/A. Fex. if I want to just filter dates in year 2014:
=CUBEVALUE(“ThisWorkbookDataModel”;”[Measures].[Revenue]”;CUBESET(“ThisWorkbookDataModel”;”{FILTER([Dates].[DateKey].members,[Dates].[YearKey]<2014)}"))
It results in N/A

Loading...

Reply
1. Chris Webb says:
  
  October 10, 2014 at 12:49 pm
  
  This doesn’t work because you’re assuming that [Dates].[YearKey] is something like a column in SQL. It’s not, its the name of a hierarchy and does not return a value that can be compared with anything else. What you want to do here is find the name (or the key) of the current member on that hierarchy, and then make the comparison. If you want the years in 2014, something like
  EXISTS([Dates].[DateKey].[DateKey].members,{[Dates].[YearKey].&[2014]})
  …is your best bet.
  
  Loading...
  
  Reply
Mer says:

January 7, 2015 at 9:34 pm

Hi Chris! I stuck in MDX again. =( Is there any way to return a set with only one column after using GENERATE function?

I have the following formula:

Generate ( EXCEPT([Dates query].[Dates].[Year].MEMBERS, [Dates query].[Dates].UNKNOWNMEMBER),
TopCount (
{
[Dates query].[Dates].CurrentMember*[MFC].[Products].[Company].Members

},

5,
[Measures].[Share_new_Company]
)
)

It always return 2 columns but I need just result of [MFC].[Products].[Company].Members.

Any ideas?

Loading...

Reply
1. Chris Webb says:
  
  January 8, 2015 at 10:00 am
  
  How about
  
  Generate ( EXCEPT([Dates query].[Dates].[Year].MEMBERS, [Dates query].[Dates].UNKNOWNMEMBER),
  TopCount (
  {
  [MFC].[Products].[Company].Members
  
  },
  
  5,
  ([Measures].[Share_new_Company],[Dates query].[Dates].CurrentMember)
  )
  )
  
  Loading...
  
  Reply
  1. Mer says:
    
    January 9, 2015 at 9:10 am
    
    No, strangely it returns companies which weren’t in TOP 5 of any period, but not all companies. But it looks closer to solution I think.
    
    Loading...
  2. Chris Webb says:
    
    January 9, 2015 at 9:41 am
    
    It will return the correct companies, I promise, but it will union the results of the topcount() so you won’t see any duplicates. Also, if you’re using this in a query, remember you’ll need to display the dates somewhere otherwise you won’t see any meaningful measure values.
    
    Another idea would be to use your original expression and then pass the set to the Extract() function.
    
    Loading...
New to this says:

February 21, 2015 at 4:02 am

Hello Chris

Firstly thank you very much for this useful post. I have a question and hopefully if you have some time you can assist me in please?

My objectives is to have a pivot table and a slicer with 2 selections of ‘top 3’ or ‘everything’.

this is what i have now:

iif([measures].[selectiontop]=”top3″,
topcount([tabledata].[department].[department].members,
3,
([measures].[series1],[tabledata].[department])),
[tabledata].[department].members)

Question: How do i go about getting the subtotal if the slicer selection if ‘top 3’? I currently have a subtotal of all department if the slicer selection if everything but not top 3.

Thanking you for your time.

Loading...

Reply
1. Chris Webb says:
  
  February 21, 2015 at 4:28 pm
  
  Is this an expression you’re using in a named set? Which version of Excel are you using?
  
  Loading...
  
  Reply
  1. New to this says:
    
    February 22, 2015 at 7:47 am
    
    Yes Chris, i am still very new to all powerpivot, mdx and dax stuff.
    
    This is the formulas i used in the manage named sets section and i am using Excel 2013.
    
    Loading...
  2. Chris Webb says:
    
    February 22, 2015 at 10:30 am
    
    Unfortunately, with Excel 2013 there isn’t a way of doing this – you would need to be able to create calculated members to get the subtotals I think and this isn’t possible with Excel 2013 and Power Pivot.
    
    Loading...
New to this says:

February 22, 2015 at 10:49 am

Oh, what a shame. Well nothing you can do about it guess. Thanks a lot for your time Chris. Cheers.

Loading...

Reply
GMF says:

September 2, 2015 at 9:30 pm

This seems to be close to what I’m trying to do in Excel 2013 with CUBE functions but I’m having trouble adapting it to my data.
I have “carts” in hospitals that carry various supplies. I have a slicer that lists the carts, and I can return order quantities and spend based on measures. But I’m having trouble returning basic information about the cart itself.
The cart table is CartList.
For example, cart DNW-013 shows up in the slicer. It’s attributes are:
Region = Central
Department = Nephrology
Cart Type = Open Line
etc.

So if I click on DNW-013 in the slicer I want to have a CUBE function to show me that its region = Central.

I can get the values
=CUBESET(“ThisWorkbookDataModel”,”[CartList].[Region].children”)
and I can get a single value using CUBERANKED MEMBER
=CUBERANKEDMEMBER(“ThisWorkbookDataModel”,CUBESET(“ThisWorkbookDataModel”,”[CartList].[Region].children”),1)

But I can’t figure out how to filter the CUBESET so that I can retrieve only the region value for DNW-013 in the slicer!

Loading...

Reply
1. Chris Webb says:
  
  September 3, 2015 at 7:58 pm
  
  I assume you’re using Power Pivot? It sounds like you need to use a DAX text measure, similar to what’s shown here – http://www.powerpivotpro.com/2015/08/so-your-detailedflat-pivot-is-slow-and-doesnt-sort-properly-try-text-measures/
  
  Loading...
  
  Reply
GMF says:

September 3, 2015 at 8:19 pm

Yes and no. I have a PowerPivot data model but I’m trying to build the dashboard using CUBE functions so that the format stays the same for executives when they look at it instead of having the columns vacillate like a yo-yo depending on what filter they choose. I haven’t tried the solution in the link you pointed to, but it sounds like it would do the same column width changes as a pivot table?

Really, isn’t there a standard CUBE format that can give other attributes of a value picked in a slicer from a Dim table? If someone picks a vendor, have the state or phone # appear? If someone picks the cart location code in my earlier example, to get the department and other info is very helpful. I tried some variations of the FILTER() code you had above but just get #NA results. This seems like it would be a very common need.

Loading...

Reply
1. GMF says:
  
  September 4, 2015 at 4:43 pm
  
  After poking around and trying various combinations, this worked for me! I have absolutely no idea why… really it was a monkey banging on a typewriter creating a Shakespeare snippet. If there’s a cleaner way to do this I’d love to know.
  
  =CUBERANKEDMEMBER(“ThisWorkbookDataModel”,CUBESET(“ThisWorkbookDataModel”,”([tblCartList].[Par Location ID].[“&$L$5&”],[tblCartList].[Region].children)”),1)
  
  So for the location code “WHMC-009B” in cell $L$5 it correctly returns “South” as the region.
  
  Cell $L$5 has this formula to retrieve the slicer value:
  
  =IF(OR(CUBESETCOUNT(Slicer_Par_Location_ID)>1,CUBERANKEDMEMBER(“ThisWorkbookDataModel”,Slicer_Par_Location_ID,1)=”All”),”Multiple Selections”,CUBERANKEDMEMBER(“ThisWorkbookDataModel”,$K$5,1))
  
  I don’t know why CUBESETCOUNT accurately returns the slicer selection count if 1 or more values are selected but also returns 1 if the filter is removed. Very annoying. That’s why I also test for the CUBERANKEDMEMBER of the slicer to capture if All is the return value. There’s probably a better way to do that too.
  
  Loading...
  
  Reply
  1. Chris Webb says:
    
    September 5, 2015 at 9:21 pm
    
    I’m glad you got this working – I’m pretty sure that a measure that returns text would be a cleaner way of doing this, though.
    
    Loading...
  2. GMF says:
    
    September 6, 2015 at 1:23 pm
    
    If you’ve got an example I’d love to use it. Perhaps I wasn’t clear on what a newbie I am – I couldn’t figure out how to create such a measure… something using HASONEVALUE and VALUES perhaps?
    
    Loading...
SB says:

February 22, 2016 at 9:54 am

Hi Chris

I’ve got a very peculiar CUBESET problem (related to your article above) and have reached an impasse trying to solve it. Hoping you might be able to shed some light.

Here’s the task I have at hand…

I’m after a set of Customer IDs, that have opted in to a particular product in a given month. When I’ve got this set, I then want to know how much revenue they have generated on a separate product in the same month. Revenue in this instances needs to be sliced by two additional dimensions.

Here’s the initial CUBESET query to get my Customer IDs…

=CUBESET(connection,”NONEMPTY({[Customer].[Customer ID].CHILDREN},([Measures].[No Of Optins],[Calendar].[YQMD Hierarchy].[Month].&[Oct 2015],[Product].[Product Hierarchy].[Product Sub Group].&[Product1]))”)

(where “connection” is a named cell, with the corresponding cube)

I get 143 members in the set, which matches a different measure (not important, but a good sense check) and I can list them out with CUBERANKEDMEMBER

Now, here’s the problem…

If I list the members out using CUBERANKEDMEMBER, I can query the individual members by the two additional dimensions. and get the corresponding revenue.

HOWEVER

If I query the set for revenue by the two additional dimensions I get NULL. I don’t understand why, and listing out the set becomes increasingly impractical if they get bigger.

Is there something wrong with my syntax that would prevent the *set* from being sliced by additional dimensions?

I’m self-taught (for better or worse) so apologies if my terminology is incorrect!

Any insight would be much appreciated.

Loading...

Reply
1. Chris Webb says:
  
  February 22, 2016 at 10:40 pm
  
  If you’re using Power Pivot or SSAS Tabular, why aren’t you doing this in a DAX measure? (if you’re using SSAS Multidimensional you could do this in MDX)
  
  Loading...
  
  Reply
  1. SB says:
    
    February 23, 2016 at 10:38 am
    
    The Excel report I’m building is connecting to a SSAS cube, and needs to be user friendly i.e. the end users can refresh a connection and/or change the month. As such, I’m somewhat constrained by the limits of cube functions and my knowledge of MDX.
    
    I’m just really puzzled as to why the set I’ve created cannot be sliced by these additional dimensions, but the individual members (when broken out) can be!
    
    Loading...
  2. Chris Webb says:
    
    February 24, 2016 at 1:53 pm
    
    Yes, but that doesn’t mean you can’t do this in a measure – that’s going to be by far the easiest and fastest way of implementing this kind of calculation. You can then use the measure in any report, whether it’s an Excel report using cube functions or not.
    
    Loading...
richard moore says:

December 17, 2016 at 10:23 pm

Hi Chris,

How does creating an MDX set of Top 10 items compare to using the filter functionality in pivot tables / charts? Should I expect one to calculate faster?

I’m using Excel 2016, and have lots of pivot charts showing Top 10 customers etc.

Thanks for your content.
Rich

Loading...

Reply
1. Chris Webb says:
  
  December 17, 2016 at 11:01 pm
  
  In theory it should be about the same – you would use the TopCount() function in your named set and the PivotTable uses the same function. However there are lots of other things that could make a difference such as how the query is constructed, whether there are calculated members being used etc.
  
  Loading...
  
  Reply
  1. richard moore says:
    
    December 18, 2016 at 11:12 am
    
    Thanks for the speedy reply.
    
    I’ll do some testing and see if there’s any noticeable difference.
    
    Loading...
Richard says:

December 20, 2016 at 3:24 pm

Hi Chris,

I’m trying to build a set containing the last 12 months, so I can create a rolling last 12 months bank balances chart in Power Pivot.

I have a table that I bring in through Power Query with the ‘latest’ month, so it would have to reference that.

I’m brand new to MDX, is there any advice you can offer?

Thanks
Rich

Loading...

Reply
Ejaz says:

July 5, 2020 at 2:22 pm

Hi Chris. Amazing Post. I am stuck with using cuberankmembers with this Cubeset. I see that the dates themselves are being ordered in the example. What if I want to order dates by the measure’s value.

Loading...

Reply