Last week someone asked me whether it was possible to do the equivalent of a SQL LIKE filter in Power Query. Unfortunately there isn’t a function to do this in the standard library but, as always, it is possible to write some M code to do this. Here’s what I came up while I was waiting around at the stables during my daughter’s horse-riding lesson. At the moment it only supports the % wildcard character; also I can’t guarantee that it’s the most efficient implementation or indeed 100% bug-free, but it seems to work fine as far as I can see…
let
Like = (Phrase as text, Pattern as text) =>
let
//Split pattern up into a list using % as a delimiter
PatternList = Text.Split(Pattern, "%"),
//if the first character in the pattern is %
//then the first item in the list is an empty string
StartsWithWc = (List.First(PatternList)=""),
//if the last character in the pattern is %
//then the last item in the list is an empty string
EndsWithWc = (List.Last(PatternList)=""),
//if the first character is not %
//then we have to match the first string in the pattern
//with the opening characters of the phrase
StartsTest = if (StartsWithWc=false)
then Text.StartsWith(Phrase, List.First(PatternList))
else true,
//if the last item is not %
//then we have to match the final string in the pattern
//with the final characters of the phrase
EndsText = if (EndsWithWc=false)
then Text.EndsWith(Phrase, List.Last(PatternList))
else true,
//now we also need to check that each string in the pattern appears
//in the correct order in the phrase
//and to do this we need to declare a function PhraseFind
PhraseFind = (Phrase as text, SearchString as list) =>
let
//does the first string in the pattern appear in the phrase?
StringPos = Text.PositionOf(Phrase, SearchString{0}, Occurrence.First),
PhraseFindOutput =
if
//if string not find then return false
(StringPos=-1)
then false
else if
//we have found the string in the pattern, and
//if this is the last string in the pattern, return true
List.Count(SearchString)=1
then true
else
//if it isn't the last string in the pattern
//test the next string in the pattern by removing
//the first string from the pattern list
//and all text up to and including the string we have found in the phrase
(true and
@PhraseFind(
Text.RemoveRange(Phrase, 0, StringPos + Text.Length(SearchString{0})),
List.RemoveRange(SearchString, 0, 1)))
in
PhraseFindOutput,
//return true if we have passed all tests
Output = StartsTest and EndsText and PhraseFind(Phrase, PatternList)
in
Output
in
Like
Using the following test data:
I can run the following query:
let
Source = Excel.CurrentWorkbook(){[Name="Phrases"]}[Content],
ChangedType = Table.TransformColumnTypes(Source,{{"Phrases", type text}}),
InsertedCustom = Table.AddColumn(ChangedType, "Test", each Like([Phrases],"%cat%sat%mat%"))
in
InsertedCustom
And get this output:
You can download the sample workbook here.
I know the Power Query team have been asked for this several times already, but it would be really useful if we could package up functions like this and make it easy to share them publicly with other Power Query users…
Pingback: Dew Drop – May 28, 2014 (#1785) | Morning Dew
Pingback: Join Conditions In Power Query, Part 1 | Chris Webb's BI Blog
Pingback: Join Conditions In Power Query, Part 1 - SQL Server - SQL Server - Toad World
Pingback: daily 05/26/2015 | Cshonea's Blog
Nice solution, but is it possible for your “Like = (Phrase as text, Pattern as text) =>” function to use two different data sources for the inputs? This seems like a partial solution to the SQL like function.
Yes, absolutely – the parameter values can come from anywhere.
Hi Chris, really excellent read.
I’m testing a variation of this, but am hitting a wall. I’m trying to take a table of terms to see if any of them match anywhere in the text strings of a column in Power Query. Using your Like Query, I can match to one term, but not multiple terms (and only exact match to the string, not broad matching to any part of the string).
Have you ever tried to solve this approach?
The pertinent code working (for single term exact, but not broad matching): Table.AddColumn(#”Renamed Columns”, “Brand”, each Like([Phrase],”%car%”))
And not working for multiple phrases (and also not broad matching, of course): Table.AddColumn(#”Renamed Columns”, “Brand”, each Like([Phrase],”%car%red car%”))
Really love the blog!
C
It sounds like a function using List.Generate(), similar to the one shown here:
https://blog.crossjoin.co.uk/2014/06/25/using-list-generate-to-make-multiple-replacements-of-words-in-text-in-power-query/
Would allow you to loop over a list of multiple terms.
Chris – Please show where the argument for PhraseFile function SearchString is obtained? I can’t see it. Thanks.
I call the PhraseFind function in the Output step, towards the end of the query.
Thanks for this, it’s very nice.
I’ve just hacked this to use a pattern table containing a list of patterns.
I merge and expand my patterns table (cartesian join) to the data table I’m matching. I then run your function passing it the columns containing the phrase and the pattern as parameters.
It’s not even slow…
https://www.dropbox.com/s/53bwjh3s8bjei64/PQLikeFilterV2.xlsx?dl=1
Pingback: Power Query - Filter with wildcards - Excel Off The Grid
Hi Chris, Excellent Article,
I would to know in 2020, if the Power Query team have been package up functions like this and make it easy to share them publicly with other Power Query users
Thanks
No, there’s no way to do this yet, sorry
How would you pass a list of phrases to the function, check each of them in order, and then exit after the first match so that it doesn’t keep evaluating?