Splitting Text By Character Transition In Power BI And Excel Power Query

The February 2019 release of Power BI Desktop includes some new functionality not mentioned in the summary blog post, functionality that is already in the latest Excel 365 builds (thanks Ed Hansberry for pointing this out) and will be extremely useful. It takes the form of four new options under the Split Column button in the Query Editor:

image

The four options allow you to split text by:

  • Lowercase to uppercase
  • Uppercase to lowercase
  • Digit to non-digit
  • Non-digit to digit

They are intended to solve problems like the one Gil Raviv blogged about here where he shows how to split camel case text in M.

Here’s an example of the lowercase to uppercase split. With the following input table:

image

The output is:

image

Here’s the M code behind:

[sourcecode language=’text’ padlinenumbers=’true’]
let
Source =
#table(
{“MyData”},
{
{“CamelCase”},
{“DimDate”}
}
),
#”Split Column by Character Transition” =
Table.SplitColumn(
Source,
“MyData”,
Splitter.SplitTextByCharacterTransition(
{“a”..”z”},
{“A”..”Z”}),
{“MyData.1”, “MyData.2″}
)
in
#”Split Column by Character Transition”
[/sourcecode]

It turns out that the new Splitter function that’s behind this, Splitter.SplitTextByCharacterTransition, has been in Power BI for at least a month already (if you want some background on Splitter functions, see my blog post here), but I don’t think anyone noticed. This new function is very flexible, and I can see it will be useful for a lot more scenarios than just these new four options in the Query Editor: it allows you to split text when there is a transition from any one character or characters to any other character or characters.

10 thoughts on “Splitting Text By Character Transition In Power BI And Excel Power Query

  1. Thank you for the information! This is something that I just “struggled” with straight on the M Query side manually. Nice to see this feature now automated. It is odd that it was not included in the “Feature summary” blog post.

    Maybe Microsoft is hiding Easter eggs for users 🙂

  2. Splitter.SplitTextByCharacterTransition({“0”..”9″},each _>”9″)(“张三100李四25.5王五92.5”)

  3. Hi Chris,
    thanks a lot for for blog post. I followed your example and made some M-Code manually changes. The M-Code for the splitter part is:

    = Table.SplitColumn(#”Changed Typ”, “MyData”, Splitter.SplitTextByCharacterTransition({“a”}, {“m”,”t”, “T”, “s”, “e”}), {“MyData.1”, “MyData.2”, “MyData.3”})

    This functionality is very awesome and makes new fantastic transitions very easy.

    Frank

  4. Dear Chris,

    I have imported data into Power Query through connecting a folder. That folder houses 20 Excel files. Now when after combining the data therein, I am applying transformation steps – at each step it downloads the entire folder again. The process becomes too slow after I merges this query with other queries – at each step of transformation, Power Query editor downloads all the files.

    Am I missing something here? Is there a way out to apply transformation instantly and faster?

    Looking forward to your reply.

    Thank you,

    Sandip

    Get Outlook for Android

    ________________________________

Leave a Reply