Listing Windows Language Code Identifiers And Their Associated Date And Number Formats With M In Power BI/Power Query

In a comment on my blog post about international date and number formats and changing data types with the “using locale” option in Power Query/Power BI, Jan Karel Pieterse asked if there was any way to get a list of the thousand and decimal separators used for number formatting by each language and region. Since this is exactly the kind of geeky question that fascinates me I decided to write an M query to answer it and – for bonus points – to find the default date format used too.

To start off, I found a table of all Windows Language Code Identifiers on this page:

https://msdn.microsoft.com/en-us/library/cc233982.aspx

Of course this can be loaded into Power Query easily using the “From Web” source. After that it’s easy to add a column to the table that takes a sample date (March 22nd 2018) and number (one hundred thousand and one tenth) and converts it to text using the language code identifier on each row: the Text.From() function does this for dates, and for numbers you have to use Number.ToText() if you want to get thousand separators and decimal separators. There are a few minor problems to deal with, such as the fact that Power Query doesn’t know what to do with dates for the “Congo Swahili” language code identifier and some rows have multiple language tags, but nothing serious.

Here’s the full code:

let
    //Sample dates and numbers to show
    SampleDate = #date(2018,3,22),
    SampleNumber = 100000+(1/10),
    //MS web page with list of language tags
    LocaleWebPage =
        Web.Page(
        Web.Contents(
         "https://msdn.microsoft.com/en-us/library/cc233982.aspx"
         )
         ),
    LocaleList = LocaleWebPage{1}[Data],
    RemoveColumns =
        Table.SelectColumns(
            LocaleList,
            {"Language", "Location (or type)", "Language tag"}
            ),
    SplitColumn =
        Table.SplitColumn(
            RemoveColumns,
            "Language tag",
            Splitter.SplitTextByAnyDelimiter(
                {",","or"},
                QuoteStyle.Csv
                ),
                {"Language tag"}
                ),
    //Create example columns
    DateExample =
        Table.AddColumn(
            SplitColumn,
            "Date",
            each Text.From(SampleDate, [Language tag])
            , Text.Type),
    NumberExample =
        Table.AddColumn(
            DateExample,
            "Number",
            each Number.ToText(SampleNumber,"N", [Language tag])
            , Text.Type),
    //Remove any rows containing errors
    RemoveErrors = Table.RemoveRowsWithErrors(NumberExample)
in
    RemoveErrors

Here’s some of the output:

image

So, if you’ve ever wondered how the Cornish speakers of south-west England like to format their dates or whether the Oromo speakers of Ethiopia use a comma or a full stop as a decimal separator, wonder no more. And if you are not interested in M at all and just want to download an Excel workbook with a list of all LCIDs and how numbers and dates are formatted for them, you can do so here.

Azure Data Studio Should Support Analysis Services And Power BI Premium Capacities

I’m at the PASS Summit this week, and in this morning’s keynote there was a demo of the newly-released Azure Data Studio  – a modern, cross-platform tool for managing and querying SQL Server, Azure SQL Database and other Azure data services (it’s carefully described as “complementary to” SQL Server Management Studio rather than a replacement for it; this blog post has a detailed discussion of this question).

This video is provides a good, short overview of what it is:

I think it’s pretty cool, BUT… it doesn’t support Analysis Services. I had a moan about this and the generally poor state of Analysis Services tooling on Twitter, was invited to meet some of the developers and was told that if enough people request Analysis Services support it might happen.

What would support for Analysis Services involve? The following springs to mind:

  • I’d like to be able to connect to and manage Analysis Services Multidimensional and Tabular on-premises and Azure Analysis Services; if that’s too ambitious I could settle for supporting only Analysis Services Tabular 2016+ and Azure Analysis Services.
  • Since we will soon be able to connect to a Power BI Premium capacity as if it was an Analysis Services instance via XMLA endpoints, I would want to be able to connect to Power BI Premium capacity too.
  • I’d want to be able to run DAX and M queries, and ideally MDX queries too.
  • I would also want to be able to work with ASSL and TMSL for scripting and editing objects.
  • Azure Data Studio has a Profiler extension that works on xEvents; it would be great if that worked with Analysis Services xEvents too.
  • DAX and M Jupyter notebooks would be really useful!
  • It would make sense for some of the functionality of existing tools like DAX Studio and BISM Normalizer being turned into extensions.

If you want to see Analysis Services support in Azure Data Studio, go to the following issue on the Azure Data Studio GitHub repository:

https://github.com/Microsoft/azuredatastudio/issues/1026

…and click the thumbs-up icon on the first post:

AzureDataStudio

Let’s make our voices heard!

 

 

The Binary.InferContentType M Function

The April 2018 release of Power BI Desktop included a new M function: Binary.InferContentType. There’s no online documentation for it yet but the built-in documentation is quite helpful:

image

I tested it out by pointing it at the following simple CSV file:

image

…and with the following M code:

let
    Source = File.Contents("C:\01 JanuarySales.csv"),
    Test = Binary.InferContentType(Source)
in
    Test

Got the following output:

image

It has successfully detected that it’s looking at a CSV file; the table in the lower half of the screenshot above is the table returned by the Csv.PotentialDelimiters field, and that shows that with a comma as a delimiter three columns can be found (my recent blog post on Csv.Document might also provide some useful context here).

I also pointed it at a few other file types such as JSON and XML and it successfully returned the correct MIME type, but interestingly when I changed the file extension of my JSON file to .txt it thought the file was a text/CSV file, so I guess it’s not that smart yet. I also could not get it to return the Csv.PotentialPositions field mentioned in the documentation for fixed width files so it may still be a work in progress…?

BI Survey 18

It’s that time again: the BI Survey (the world’s largest survey of BI tools and users) needs your input. Here’s the link to take part:

https://www.efs-survey.com/uc/BARC_GmbH/396b/?a=101

As a reward for participating you’ll get a summary of the results and be entered into a draw for some Amazon vouchers. As a reward for plugging the BI Survey here I get to see the full results and blog about them later on in the year, and the results are always fascinating. Last year Power BI was breathing down the necks of more established vendors like Tableau and Qlik; this year I expect Power BI to be in an even stronger position.

Using SSAS Multidimensional As A Data Source For Power BI (Video)

The nice people at PASS have made a video of my session on “Using SSAS MD as a data source for Power BI” available to view for free on YouTube:

I’m honoured that it’s listed one of their “Best of PASS Summit 2017” sessions, and there are lots of other great videos on the same page including Alberto Ferrari’s session on DAX optimisation.

Some of the tips in this video include a few things I’ve been meaning to blog about for a while, including how important it is to set the ValueColumn property on your dimension attributes in SSAS MD – it lets you use lots of functionality in Power BI that isn’t otherwise available, including date slicers.

Data Privacy Settings In Power BI/Power Query, Part 4: Disabling Data Privacy Checks

So far in this series, I have shown how changing the data privacy settings for a data source can affect the performance of queries and even prevent them from executing completely. What I haven’t mentioned yet is that you also have the option of disabling data privacy checks completely in Power BI Desktop and Excel. In this post I will show you how you can disable data privacy checks and discuss the pros and cons of doing so.

In Power BI Desktop you can change whether data privacy checks are applied when a query executes by going to File/Options And Settings and selecting Options:

image_thumb[7]

The same settings can be found in Excel 2016 by going to the Data tab, clicking Get Data and then selecting Query Options.

image

In both cases this brings up the Options dialog.

There are two panes in the Options dialog with properties that are relevant to how data privacy checks are applied. First of all, in Global/Privacy, there are global properties that are relevant for every .pbix or Excel file that you open on your PC:

image_thumb[6]

The three options here need a little bit of explanation:

  1. Always combine data according to your Privacy Level settings for each source means that data privacy settings are always applied for every .pbix or Excel file you open, regardless of the properties (described below) that you have saved for individual files.
  2. Combine data according to each file’s Privacy Level settings means that the properties set on individual .pbix or Excel files control how the data privacy checks are applied.
  3. Always ignore Privacy Level settings means that data privacy settings are always ignored, in every .pbix or Excel file you open, regardless of settings saved for individual files.

Then, in the Current File/Privacy pane, there are properties that are saved in and apply to the current .pbix or Excel file that you have open:

image

The radio buttons here are greyed out if you have options #1 or #3 selected in the previous pane; it’s only if you have selected option #2, Combine data according to each file’s Privacy Level settings, that these properties are taken into account. You may need to close and reopen the Options dialog if you have changed settings in the previous pane but the radio buttons here remain greyed out.

The two options here are:

  1. Combine data according to your Privacy Level settings for each source, which means that the data privacy settings that you have set for each data source are used to control how queries that combine data from multiple data sources behave. This is the default setting.
  2. Ignore the Privacy Levels and potentially improve performance, which means that data privacy settings are completely ignored when queries combine data from multiple data sources.

To sum up, these two groups of properties allow you to choose whether data privacy settings are applied differently for different .pbix or Excel files, or whether, on your PC, they are always applied or always ignored.

For Power BI users it is important to remember that these settings only apply to Power BI Desktop. After a report has been published, if you are using the On-Premises Data Gateway, you also need to configure data privacy settings on the data sources used by your dataset in the Power BI portal. If you are using the On-Premises Data Gateway in Personal Mode (what used to be called the Personal Gateway) then you can configure it to ignore data privacy settings as described here. Unfortunately if you are not using Personal Mode (ie you are using what used to be called the Enterprise Gateway, and what is now just called the On-Premises Data Gateway) then at the time of writing there is no way to configure the gateway to ignore data privacy levels. You can vote here to get this changed. It’s also worth mentioning that right now you can’t combine data from online and on-premises data sources in a gateway either, although it sounds like this limitation will be addressed soon. To work around these limitations you have to import data into separate tables in the dataset and then use DAX calculated tables to combine the data instead – a nasty hack I know, but one that I’ve had to implement myself a few times.

It can be incredibly tempting to avoid the problems associated with data privacy checks by setting Power BI and Excel to ignore them completely. Doing this certainly avoids a lot of headaches and confusion with the Formula.Firewall error message and so on. It also ensures that your queries execute as fast as they can: this is not just because query folding happens whenever possible but because the act of applying the data privacy checks alone can hurt query performance. Recently I saw a case where the only data source used was an Excel workbook (so no query folding was possible) and turning off the data privacy checks made a massive difference to query performance.

However, I cannot recommend that you turn off data privacy checks for all your Excel workbooks and .pbix files by default. Firstly, if you are working with sensitive or highly-regulated data, leaving the data privacy checks in place at least forces you to consider the privacy implications of query folding on a case-by-case basis. On the other hand ignoring data privacy checks by default makes it more likely that you or one of your users will create a query that accidentally sends data to an external data source and breaches your organisation’s rules – or even the law – concerning how this data should be handled. Secondly, if you are a Power BI user and need to use the On-Premises Data Gateway, then you risk creating reports that work fine in Power BI Desktop when the data privacy checks are ignored but which cannot be refreshed after they have been published because the On-Premises Gateway still applies those checks.

Converting Lists Of Numbers To Text Ranges In Power Query

In a comment on my post on Creating Sequences of Integers And Characters In Power BI/Power Query Lists a reader, Paul G, asked me the following question:

can you reverse this? e.g i have a list (1,2,3,5,7,8,9,12,13,14,15) can i convert this to (1-3, 5 ,7-9,12-15)

This got me thinking… I was sure it could be done in M, but would it be possible using just the UI? As far as I can see, it isn’t – there’s one crucial thing I can’t do – but I would be interested to see if anyone else can come up with a no-code solution.

The Almost UI-Only Solution

Here’s the source data, an Excel table called SourceTable:

image

Here’s what I did to create the ranges in a new Power Query query:

  1. Load the table into Power Query and set the Numbers column to have a data type of Whole Number
    image
  2. Sort the Numbers column in ascending order (even though in this case it already is)
    image
  3. Add an Index Column. It doesn’t matter what number it starts at (though I started it at one) but it does matter that it has a consecutive list of whole numbers in it:
    image
  4. Select both columns in the table, go to the Add Column tab in the query editor and under the Standard button select Subtract to add a new column containing the value of [Numbers]-[Index]:
    image
    Each distinct value in this column equates to a range of numbers in our final output – for example the rows that have the value 0 in this table will become the 1-3 range in the output.
  5. Do a Group By on this table, grouping by the Inserted Subtraction column and finding the minimum and maximum values in each range:
    imageimage
  6. Remove the Inserted Subtraction column because we don’t need it any more:
    image
  7. Select the Min and Max columns and then do a Merge Columns to combine these two sets of values using a dash:
    image
    image
  8. You now have your range names, but in the case of 5-5 you need to just return 5, so add a Conditional Column to handle this like so:
    image
    image
  9. Set the resulting column to have a data type of text:
    image
  10. Remove all other columns apart from Custom:
    image
  11. And finally, combine all the values together into a single comma-delimited list. This is where the UI lets you down, at the last step! Gil Raviv shows one way to solve this problem in a recent post, and then in a follow-up posts a much more elegant solution using M from Imke Feldmann which is my preferred approach too.  I won’t repost all the steps (which are very well described here anyway) but basically you do another Group By, then alter the code of the resulting step to use Text.Combine() to aggregate the text.
    image

Job done! If anyone from the Power Query team is reading this blog, here’s a message for you: please give us a new option in the Group By dialog to concatenate all the text values in a column, rather like the DAX ConcatenateX() function. Thanks!

The Pure M Solution

Of course you would like to see a pure M solution too, wouldn’t you? Here you go:

let
    //The list to find ranges in
    Source = {1,2,3,5,7,8,9,12,13,14,15},
    //Sort in ascending order
    SortedSource = List.Sort(Source, Order.Ascending),
    //Get a list of all the positions in the list
    //rather like adding an index column
    Positions = List.Positions(SortedSource),
    //Create a list of all the groups by subtracting
    //each number from its position in the list
    Groups = List.Transform(Positions, each SortedSource{_}-_),
    //The function to use with List.Accumulate
    RangeAccFunction = (state, current) =>
        let
            //Get the current number from the iteration
            CurrentNumber = Text.From(SortedSource{current}),
            //Get the current group number
            CurrentGroup = Groups{current},
            //Get the previous group number
            LastGroupIndex = if current=0 then 0 
                                else current-1,
            LastGroup = Groups{LastGroupIndex},
            //Get the next group number
            NextGroupIndex = if current=List.Max(Positions) 
                                then 0 
		                else current+1,
            NextGroup = Groups{NextGroupIndex},
            //Generate the text for this iteration
            TextToAdd = 
                if current=0 then CurrentNumber 
		 else
                if CurrentGroup<>LastGroup 
                 then ", " & CurrentNumber 
		 else
                if CurrentGroup<>NextGroup 
                 then "-" & CurrentNumber 
		 else ""
            in
                //Return the previous text plus 
	        //the newly generated text
                state & TextToAdd,
    //Use List.Accumulate to iterate over the list of positions
    //And generate the text for the ranges
    Output = List.Accumulate(Positions, "", RangeAccFunction)
            
in
    Output

The output in this case is just the text we’re looking for:

image

This turned out to be a great opportunity to use the List.Accumulate() function; once again, Gil Raviv has a great post describing this function and how it can be used. In this case I’m using a very similar approach to the one above to group the numbers, then declaring a function called RangeAccFunction that can be passed to List.Accumulate() to generate the output text. Again, I would be curious to see other solutions!

You can download the sample workbook for this post here.

%d bloggers like this: