Multiselect, Filtering And Functions In Power Query

If you’re a Power Query enthusiast you’re probably already comfortable with creating functions and passing values to them. However in some scenarios you don’t want to pass just a single value to a parameter, you want to pass multiple values – for example if you are filtering a table by multiple criteria. What’s the best way of handling this in Power Query?

Imagine that you wanted to import data from the DimDate table in the SQL Server Adventure Works DW database. It’s a pretty straightforward date dimension table as you can see:

image

Imagine also that you didn’t want to import all the rows from the table but just those for certain days of the week that the user selects (filtering on the EnglishDayNameOfWeek column).

The first problem is, then, how do you allow the user to make this selection in a friendly way? I’ve already blogged about how the function parameter dialog can be made to show ‘allowed’ selections (here and here) but this only allows selection of single values. One solution I’ve used is to create an Excel table – sourced from a Power Query query of course – and then let users select from there.

In this case, the following query can be used to get all the distinct day names:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = Table.SelectColumns(dbo_DimDate,
																							{"DayNumberOfWeek", "EnglishDayNameOfWeek"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",
                                    {{"DayNumberOfWeek", Order.Ascending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows"),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",
																			{"DayNumberOfWeek"}),
    #"Added Custom" = Table.AddColumn(#"Removed Columns", "Selected", each "No")
in
    #"Added Custom"

Nothing much interesting to say about this apart from that it was all created in the UI, it shows the day names in the correct order, and it has an extra column called Selected that always contains the value “No”. The output table in Excel looks like this:

image

The Selected column is going to allow the end user to choose which days of the week they want to filter the main table by. Since “Yes” and “No” are going to be the only valid values in this column you can use Excel’s Data Validation functionality to show a dropdown box in all of the cells in this column that allows the user from selecting one of those two values and nothing else.

image

image

Once the user has selected “Yes” against all of the day names they want to filter by in the Excel table, the next step is to use this table as the source for another Power Query query. To be clear, we’ve used Power Query to load a table containing day names into an Excel table, where the user can select which days they want to filter by, and we then load this data back into Power Query. This second query (called SelectedDays in this example) then just needs to filter the table so it only returns the rows where Selected is “Yes” and then removes the Selected column once it has done that:

let
    Source = Excel.CurrentWorkbook(){[Name="DistinctDates"]}[Content],
    #"Filtered Rows" = Table.SelectRows(Source, each ([Selected] = "Yes")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Selected"})
in
    #"Removed Columns"

image

This query doesn’t need to be loaded anywhere – but it will be referenced later.

With that done, you need to create a function to filter the DimDate table. Here’s the M code:

(SelectedDays as list) =>
let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Filtered Rows" = Table.SelectRows(dbo_DimDate, 
                             each List.Contains(SelectedDays,[EnglishDayNameOfWeek]) ) 
in
    #"Filtered Rows"

The thing to notice here is the condition used in the Table.SelectRows() function, where List.Contains() is used to check whether the day name of the current row is present in the list passed in through the SelectedDays parameter.

The final step is to invoke this function and pass the column from the query containing the selected days to it. There is a bit of UI sugar when you invoke a function with a parameter of type list that I blogged about recently. In this case when you invoke the function you just have to pass the pass it the EnglishDayNameOfWeek column from the SelectedDays query.

image

Here’s what the code for the query that invokes the function looks like:

let
    Source = DimDate(SelectedDays[EnglishDayNameOfWeek])
in
    Source

And of course, when you run the query and output the results to a table, you get the DimDate table filtered by all of the days of the week you have selected:

image

To change the output the user just needs to change the selected days and then refresh this last query.

In case you’re wondering, this query does get folded back to SQL Server too. Here’s the SQL generated by Power Query:

select [$Ordered].[DateKey],
    [$Ordered].[FullDateAlternateKey],
    [$Ordered].[DayNumberOfWeek],
    [$Ordered].[EnglishDayNameOfWeek],
    [$Ordered].[SpanishDayNameOfWeek],
    [$Ordered].[FrenchDayNameOfWeek],
    [$Ordered].[DayNumberOfMonth],
    [$Ordered].[DayNumberOfYear],
    [$Ordered].[WeekNumberOfYear],
    [$Ordered].[EnglishMonthName],
    [$Ordered].[SpanishMonthName],
    [$Ordered].[FrenchMonthName],
    [$Ordered].[MonthNumberOfYear],
    [$Ordered].[CalendarQuarter],
    [$Ordered].[CalendarYear],
    [$Ordered].[CalendarSemester],
    [$Ordered].[FiscalQuarter],
    [$Ordered].[FiscalYear],
    [$Ordered].[FiscalSemester]
from 
(
    select [_].[DateKey],
        [_].[FullDateAlternateKey],
        [_].[DayNumberOfWeek],
        [_].[EnglishDayNameOfWeek],
        [_].[SpanishDayNameOfWeek],
        [_].[FrenchDayNameOfWeek],
        [_].[DayNumberOfMonth],
        [_].[DayNumberOfYear],
        [_].[WeekNumberOfYear],
        [_].[EnglishMonthName],
        [_].[SpanishMonthName],
        [_].[FrenchMonthName],
        [_].[MonthNumberOfYear],
        [_].[CalendarQuarter],
        [_].[CalendarYear],
        [_].[CalendarSemester],
        [_].[FiscalQuarter],
        [_].[FiscalYear],
        [_].[FiscalSemester]
    from [dbo].[DimDate] as [_]
    where [_].[EnglishDayNameOfWeek] in ('Monday', 'Wednesday', 'Friday')
) as [$Ordered]
order by [$Ordered].[DateKey]

Notice that the Where clause contains an IN condition with all of the selected days.

You can download the example workbook for this post here.

17 thoughts on “Multiselect, Filtering And Functions In Power Query

  1. This is great stuff. However, I added this to my power query that loads to the data model to help stop redundant data going into it (I don’t want plan and forecast, just one or the other) The problem I have now is that it is taking an eternity to load to the data model whereas before it was very quick. I have about 27000 rows of data so I’m guessing the custom function is being invoked for every single row to adverse effect?

    • What data source are you using? If you’re using SQL Server, have you run a Profiler trace to see if query folding is taking place? If query folding has stopped, that might explain why performance is so bad.

      • Hi, just a csv file
        I was importing data from a worksheet but converted it to csv which made it far slicker
        It feels like the parameterised function which obtains my list of filter values is the culprit. The function references a small table in the same manner as your example
        When you edit the query and step through it is fast but as soon as you load to data model it slows up massively

      • It could well be that it just performs badly on a csv file, as opposed to SQL Server where the filter can be passed back to the database. Have you tried using Table.Buffer() somewhere before the filtering takes place? Another approach would be to do an inner join between the tables using Table.Join, rather than use a function.

        I also wonder whether this is relevant, even though it seems just to be related to query folding: http://www.thebiccountant.com/2015/09/17/filter-sql-server-queries-with-excel-tables-query-folding-limitations-and-bug-workaround/

      • Table.buffer is a mystery to me. As I thought it was only of use when interfacing with SQL I took it out. I did initially have it sprinkled everywhere. I’m going to put it back in and see what happens
        BTW are you planning on writing any other books as I wouldn’t have got up-to-speed without your first?

      • I’m not planning on updating my Power Query book at the moment, no. Everything’s changing too fast – it would be impossible to write a book that wasn’t completely out-of-date by the time it’s published.

      • No problem, understand. What I’m after is a book which gets me writing and thinking about M code as instinctively as I do with VBA and there is a definite space for this
        I’ve invested a lot of effort in Power BI specifically Data Model & Power Query so I hope that Microsoft evolve and embed these functionalities further into Excel (certainly looks that way going by 2016 with Get & Transform)

  2. This changed made all the difference:

    ActiveDataDescList = fDataDescList(“DimSalesDesc”, “Sales_datadesc”), // pass excel table name & filter column name in source data
    FilteredTable = Table.SelectRows(ChangedType, each List.Contains(ActiveDataDescList, [Sales_datadesc] )),
    BufferedTable = Table.Buffer(FilteredTable) ///// THIS MADE THE DIFFERENCE !!!
    in
    BufferedTable

    Although it has slowed the query down a touch, at least I can see things happening and the data being loaded in. Without it you are just stuck with ‘Loading’ for a long period

    So I now have a solution to getting rid of redundant data at a small performance cost

  3. Approximate match for List,Contains? The language spec identifies a “Equation criteria” parameter, but I cannot decipher it. Would it allow a “begins with” match for List.Contains?

  4. Thank you! You helped me a lot. But I was having a problem with loading and updating my query:

    I had 2 queries, one for my table, and another for the filter options, this second one I was taking only the list using the field name that I wanted, like this:

    QueryToExcelFilterTable
    “`
    = Excel.CurrentWorkbook(){[Name=”filterProduct”]}[Content][Product]
    “`

    So, I was passing the list as an argument to List.Contains:
    “`
    = Table.SelectRows(PreviousStepRepresentingTable, each List.Contains(QueryToExcelFilterTable, [Product])
    “`

    It was loading only the 200 first rows, kept updating with no end. Just as an experiment a tried to write my list literaly, so `List.Contains({“Prod1″,”Prod2″,”Etc”}, [Product])`, and it worked perfectly, loading 354 rows.

    To fix that I’ve used the function List.Buffer passing the list as argument and again, it worked.
    I just changed the QueryToExcelFilterTable to:
    “`
    = List.Buffer( Excel.CurrentWorkbook(){[Name=”filterProduct”]}[Content][Product]) )
    “`

    If someone has the same problem, I hope it can be usefull.

  5. Hi Chris,

    I’m trying (and failing) to do almost the opposite of this. In other words, I have a table of values that I want to exclude from the main data table rather than include. Ideally, I don’t want to have to select each row to keep but simply add a new row to an override table to exclude it from the main table. I can’t find anything like List.DoesNotContain. Any ldeas?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s