Join Conditions In Power Query, Part 1

In last week’s post I showed how to create a simple LIKE function in Power Query which could be used in a calculated column. This week I’m going to show you how you can use this function in a condition when joining two tables together.

Consider the following two tables in an Excel worksheet:

image

Let’s say you want to join the Patterns table to the Phrases table, but only return the rows for each pattern where the Like() function returns true. Power Query has good support for different types of joins in the Table.Join() and Table.NestedJoin() functions but it isn’t immediately obvious how to handle join conditions such as this one.

Here’s the M code for a query that shows you how to do this:

let

    //Load Patterns table

    PatternsTable = Excel.CurrentWorkbook(){[Name="Patterns"]}[Content],

    //Load Phrases table

    PhrasesTable = Excel.CurrentWorkbook(){[Name="Phrases"]}[Content],

    //Add custom column to Patterns that returns the Phrases table for each row

    InsertedCustom = Table.AddColumn(PatternsTable, "AllPhrases", each PhrasesTable),

    //Expand the new column

    ExpandedTable = Table.ExpandTableColumn(InsertedCustom, "AllPhrases", {"Phrases"}, {"Phrases"}),

    //Filter the expanded table using the Like() function

    Custom1 = Table.SelectRows(ExpandedTable, each Like([Phrases],[Patterns]))

in

    Custom1

 

What this query does is the following:

  • Gets the data from both the Patterns table and the Phrases table. I’ve done this as two steps in the same query for simplicity; you might want to create two separate queries to do this.
  • On the Patterns table, add a new custom column that returns the entire Phrases table as a value. The Insert Custom Column dialog and its output will look like this:

image

  • Expands the new custom column so that every row of the Phrases table is displayed against every row of the Patterns table – in effect, it does a cross join between the two tables. Interestingly there is no option to do a cross join using the Table.Join() and Table.NestedJoin() functions.

image

  • Finally, it uses the Like() function in Table.SelectRows() as follows:

    Table.SelectRows(ExpandedTable, each Like([Phrases],[Patterns]))

    …to filter the table from the previous step so that you only get the rows back where the Like() function returns true and the pattern matches the phrase:

image

So… we have a solution to our problem (and you can download the example workbook here). However, be warned: for large tables this approach may not perform well! In part two I’ll show you another example of a complex join condition and show you how different approaches can yield very different performance.

22 thoughts on “Join Conditions In Power Query, Part 1

  1. Awesome, Chris
    Once we figure out how to use stuff like this, life for Excel Power users like me (30 is never going to be the same again!

  2. Hi Chris, I didn’t know where to ask this since I couldn’t find any blogpost with similar subject, I have table where I want to replace certain text in one of the column with with another text. I know I can do this one by one by using Table.ReplaceValue() but is there a way I can do this more efficiently using another table which contains two columns, first contains the TextToReplace and 2nd Contains the Replacement text. I have about 50 texts to replace. I’ve been pulling hair trying to accomplish this using function that iterate but haven’t been successful so far. let me know if this is possible.

      1. Thanks Chris, I’ve sent the email to info@.. email address

  3. Chris,
    Regarding your statement above:
    “Interestingly there is no option to do a cross join using the Table.Join() and Table.NestedJoin() functions.”

    Unless I’m mistaken, it seems you can leave the “key” arguments for both tables in the functions as empty lists “{}” in order to perform a cross join. For the NestedJoin, it seems the function will only add the join column if the 2nd table uses the Table.PrefixColumns function to prefix the column names, even if they’re not the same to begin with, which seems odd.

    Hope that helps.

  4. Hi can I ask: Can one join, in PowerQuery, an Excel table with a text file, before the result is loaded to the current Excel file?

      1. Thanks so much- That sounds great. Once can then translate columns from say a very large text file, using an Excel sheet with the mapping, before drawing it into Excel. Which would save a lot of space.

  5. Chris Hi, thanks for the article! Just one question, this line throws me an error: Table.SelectRows(ExpandedTable, each Like([Phrases],[Patterns]))
    The Like function changed or was deprecated?

  6. Hello Chris,
    I’m looking for solution for situation bellow.
    I have table with column with values. Want to duplicate this row same times as in column with value and increase value in another column
    As example:
    Initial table:
    Row 1 | 35 000 | 10 000 | 4
    Row 2 | 18 000 | 10 000 | 2
    Row 3 | 23 000 | 10 000 | 3
    Row 4 | 8 000 | 10 000 | 1
    Row 5 | 19 000 | 10 000 | 2

    Result Table
    Row 1 | 10 000 | 10 000 | 1
    Row 2 | 10 000 | 10 000 | 2
    Row 3 | 10 000 | 10 000 | 3
    Row 4 | 5 000 | 10 000 | 4
    Row 5 | 10 000 | 10 000 | 1
    Row 6 | 8 000 | 10 000 | 2
    Row 7 | 10 000 | 10 000 | 1
    Row 8 | 10 000 | 10 000 | 2
    Row 9 | 3 000 | 10 000 | 3
    Row 10 | 8 000 | 10 000 | 1
    Row 11 | 10 000 | 10 000 | 1
    Row 12 | 9 0000 | 10 000 | 2

    Do you think that will be possible to do that?

    Thank you in advance!
    Todor Todorov

  7. Hi Chris,
    Wonder if you find a way to separate values?

    Best regards
    Todor Todorov

      1. Hi Chris,
        Just remind about my previous post.
        Sorry for my poor English.

        Best regards
        Todor Todorov

  8. Hi Chris,

    I am trying to achieve conditional cross joint in M / Power Query.

    I have a Tasks table with following data:

    Task Name Start End
    A 01/08/2018 03/08/2018
    B 07/08/2018 17/08/2018
    C 21/08/2018 31/08/2018

    I have a Calendar table that spans 01/01/2018 to 31/12/2018.

    I wanted to get one line per day for a given task between Start and End Date.

    e.g.

    Calendar Date Task Name Start End
    01/08/2018 A 01/08/2018 03/08/2018
    02/08/2018 A 01/08/2018 03/08/2018
    03/08/2018 A 01/08/2018 03/08/2018

    I can do this in DAX by creating a new table with:

    Table =

    FILTER(

    CROSSJOIN(‘Project Calendar’, ‘Project Tasks’),

    ‘Project Tasks'[Start] = ‘Project Calendar'[Date]

    )

    How can I achieve the same using M / Power Query.

    I tried using Merge Queries but since I can only select one date field for Tasks table when merging with Calendar table, it does not give the desired results.

    Any help will be greatly appreciated.

    1. My suggestion is not to do this as a Merge, but instead create a custom column in M that contains a nested table value, where each row contains a table that has one row for each date from the start date to the end date. You could then expand the column and get the number of rows you need.

Leave a Reply to Chris WebbCancel reply