Power BI And Case Sensitivity

Case sensitivity is one of the more confusing aspects of Power BI: while the Power Query engine is case sensitive, the main Power BI engine (that means datasets, relationships, DAX etc) is case insensitive. Imke Feldmann shows how to make Power Query case insensitive through custom M here; in this blog post I’m going to concentrate on what case insensitivity means in datasets and DAX and show a way to (kind of) work around it.

Starting with a simple example, let’s say you have a data source that contains data that looks like this:

image

This is what the data looks like in the Power Query Editor which is, as I said, case sensitive – which means that it sees “A” and “a” as different characters, so you see six distinct characters in each row of the table. This could be data from any data source; to make things easy I’ve used the following M query to generate this table:

let
    Source = 
    #table(
        type table 
        [MyTextColumn = text], 
        {
            {"A"},
            {"a"},
            {"B"},
            {"b"},
            {"C"},
            {"c"}
        })
in
    Source

When this data gets loaded into a Power BI dataset, however, you’ll see the following in the Data pane of the main Power BI window:

image

Because the main Power BI engine is case insensitive – so for example “a” and “A” are considered as the same character – when this data is loaded in, it only sees three distinct characters and you can’t be sure whether you’ll get the lower case or upper case character stored twice. This is just the way it works; while an instance of Analysis Services Tabular, which is basically the same engine that is found in Power BI, can be either case sensitive or case insensitive, this option isn’t available in Power BI. There is an idea to support case sensitivity as an option in Power BI here and while there are some valid scenarios where you need this, in my experience case sensitivity causes a lot of problems so I can see why it wasn’t a priority.

What’s more you’ll find that DAX calculations are case insensitive too. If you try the following DAX expression to create a calculated table:

Demo Table = 
DATATABLE(
    "MyTextColumn",
    STRING,
    {
        {"A"},
        {"a"}
    }
)

You’ll get this:

image

DAX measures behave in a similar way. Given the following table:

image

If you use the following measure in a table visual:

Demo Measure = 
IF(
    SELECTEDVALUE('Demo Table 2'[MyIntColumn])=1,
    "A",
    "a"
)

…you’ll see:

image

The only way you can work around this case insensitivity is to make text values that would otherwise look the same to the Power BI engine somehow different. One way of doing this would be to add some extra characters to your text. You might think adding some extra spaces would be the way to go; revisiting the first M query shown above, you could add a space to every lower case character in the table like so:

let
    Source = 
    #table(
        type table 
        [MyTextColumn = text], 
        {
            {"A"},
            {"a "},
            {"B"},
            {"b "},
            {"c "},
            {"C"}
        })
in
    Source

But this doesn’t work because another little-known behaviour of the Power BI engine is that all leading and trailing spaces are trimmed when text is loaded. Anyway, spaces may not be visible but they still take up… well space. A better option – and one that actually works – is to use the Unicode Zero-Width Space character and the M Character.FromNumber function instead like so:

let
    Source = 
    #table(
        type table 
        [MyTextColumn = text], 
        {
            {"A"},
            {"a" & Character.FromNumber(8203)},
            {"B"},
            {"b" & Character.FromNumber(8203)},
            {"c" & Character.FromNumber(8203)},
            {"C"}
        })
in
    Source

The great thing about this character is that although it’s there, it’s invisible and takes no space. When you load this last M query into Power BI you see the following table:

image

Let’s finish off with a more advanced example of how to use this rather excellent character. Say you have the following source data returned by an M query called SourceData:

image

Of course, when this table is loaded into Power BI, you see the following:

image

BUT, the following M query takes this table and for each row goes through the text in the OriginalText column and adds a zero-width space after each lower case character:

let
    Source = 
    SourceData,
    ToList = 
    Table.AddColumn(
        Source, 
        "Chars", 
        each Text.ToList([OriginalText])
        ),
    LowerCaseChars = 
    {"a".."z"},
    AddInvisibleChars = 
    Table.AddColumn(
        ToList, 
        "AddInvisibleChars", 
        each 
        List.Transform(
            [Chars], 
            each 
            if 
            List.Contains(LowerCaseChars, _) 
            then 
            _ & Character.FromNumber(8203) 
            else _
            )
            ),
    RecombineList = 
    Table.AddColumn(
        AddInvisibleChars, 
        "OutputText", 
        each 
        Text.Combine([AddInvisibleChars]), 
        type text
        ),
    RemovedOtherColumns = 
    Table.SelectColumns(
        RecombineList,
        {"OutputText"}
        )

in
    RemovedOtherColumns

When you load the output of this query into Power BI, because the zero-width spaces after each lower case character make Power BI see each piece of text as being different, it looks like case has been preserved even when it hasn’t:

image

You can download the Power BI file for this last example here.

[Thanks to Greg Galloway, Akshai Mirchandani and Jeffrey Wang for their help with this post]

9 responses

  1. Pingback: Power BI and Case Sensitivity – Curated SQL

  2. Great post — I didn’t know that “A” and “a” become the same value when loaded!

    I noticed that sometimes Power Query is case-insensitive: when query folding takes place and the underlying data source is case-insensitive.

    DAX is sometimes case-sensitive — most of the time it’s clear when this is so: FIND and SEARCH perform the same function and are case-sensitive and case-insensitive, respectively.

  3. Hi Chris,
    I have just the opposite problem…
    I’ve created the different “Level” columns (15) in an SSAS Tabular model, from a Parent Child hierarchy. I need to sort TOTAL ASSETS and Total Assets as different/unique values in Level1Name. The GL Account Number is different for each, thus the error. This has caused quite a headache as I keep getting the duplicate value error in my Tabular model. We’re trying not to append a ‘1’ to values as this is what the client will see.
    I reviewed the solution above, but this won’t work with a table as large as ours. We don’t have the option to apply sorting in Power BI.

    Thoughts?

    BW

  4. Hi Chris

    If I create a column with many leading spaces and load them to Desktop, the Data view seems to show them all, but the table visual seems to show none, but if I do something like this

    appendcol = “aa” & ‘text with both spaces'[Message] & “zz”

    and put them into a table visual, I can see a consistent gap between the “aa” and the text for data that had leading spaces, but not for the others.

    Here is my data.
    Message
    This message does have spaces 1
    This message doesn’t have any 2
    This message does have spaces 3
    This message doesn’t have any 4
    This message does have spaces 5
    This message doesn’t have any 6

    Also, with this data, when you sort ascending the appendcol, it puts the them in this order (5,3,1,2,4,6) and when I export the data as a CSV from the visual this is the output.

    Message,appendcol
    This message does have spaces 5,aa This message does have spaces 5zz
    This message does have spaces 3,aa This message does have spaces 3zz
    This message does have spaces 1,aa This message does have spaces 1zz
    This message doesn’t have any 2,aaThis message doesn’t have any 2zz
    This message doesn’t have any 4,aaThis message doesn’t have any 4zz
    This message doesn’t have any 6,aaThis message doesn’t have any 6zz

    As far as I can see, leading spaces don’t get completely trimmed.
    Am I doing something wrong?

    Many thanks
    Oliver

  5. Hi Chris

    If I create a column with many leading spaces and load them to Desktop, the Data view seems to show them all, but the table visual seems to show none, but if I do something like this

    appendcol = “aa” & ‘text with both spaces'[Message] & “zz”

    and put them into a table visual, I can see a consistent gap between the “aa” and the text for data that had leading spaces, but not for the others.

    Here is my data.

    ID,Message
    1, This message does have spaces 1
    2,This message doesn’t have any 2
    3, This message does have spaces 3
    4,This message doesn’t have any 4
    5, This message does have spaces 5
    6,This message doesn’t have any 6

    Also, with this data, when you sort ascending the appendcol, it puts the them in this order (5,3,1,2,4,6) and when I export the data as a CSV from the visual this is the output.

    ID,Message,appendcol
    5, This message does have spaces 5,aa This message does have spaces 5zz
    3, This message does have spaces 3,aa This message does have spaces 3zz
    1, This message does have spaces 1,aa This message does have spaces 1zz
    2,This message doesn’t have any 2,aaThis message doesn’t have any 2zz
    4,This message doesn’t have any 4,aaThis message doesn’t have any 4zz
    6,This message doesn’t have any 6,aaThis message doesn’t have any 6zz

    As far as I can see, leading spaces don’t get completely trimmed.
    Am I doing something wrong?

    Many thanks
    Oliver

  6. Hi Chris

    If I create a column with many leading spaces and load them to Desktop, the Data view seems to show them all, but the table visual seems to show none.

    Here is my data.

    ID,type,c1,c2,c3,c4,c5
    1,has space, a , b , c , d , e
    2,no space,f,g,h,i,j
    3,has space, k , l , m , n , o
    4,no space,p,q,r,s,t
    5,has space, u , v , w , x , y

    If I do something like this

    c12 = “*” & ‘text with spaces 2′[c1] & “*”
    c22 = “*” & ‘text with spaces 2′[c2] & “*”
    c32 = “*” & ‘text with spaces 2′[c3] & “*”
    c42 = “*” & ‘text with spaces 2′[c4] & “*”
    c52 = “*” & ‘text with spaces 2′[c5] & “*”

    and put them into a table visual, I can see a consistent gap between the first “*” and the text for data that had leading spaces, but not for the others.

    When I export the data as a CSV from the visual this is the output.

    ID,type,c1,c2,c3,c4,c5,c12,c22,c32,c42,c52
    1,has space, a, b, c, d, e,* a*,* b*,* c*,* d*,* e*
    2,no space,f,g,h,i,j,*f*,*g*,*h*,*i*,*j*
    3,has space, k, l, m, n, o,* k*,* l*,* m*,* n*,* o*
    4,no space,p,q,r,s,t,*p*,*q*,*r*,*s*,*t*
    5,has space, u, v, w, x, y,* u*,* v*,* w*,* x*,* y*

    As far as I can see, trailing spaces do get trimmed, but leading spaces don’t get completely trimmed.
    Am I doing something wrong?

    Many thanks
    Oliver

  7. Hello, Chris.

    When you put an “&” in your code instead of plain “&” it looks a little confusing. When I see a semicolon inside M code – I start looking for “section” immediately =)

    Best regards

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: