The Binary.InferContentType M Function

The April 2018 release of Power BI Desktop included a new M function: Binary.InferContentType. There’s no online documentation for it yet but the built-in documentation is quite helpful:

image

I tested it out by pointing it at the following simple CSV file:

image

…and with the following M code:

let
    Source = File.Contents("C:\01 JanuarySales.csv"),
    Test = Binary.InferContentType(Source)
in
    Test

Got the following output:

image

It has successfully detected that it’s looking at a CSV file; the table in the lower half of the screenshot above is the table returned by the Csv.PotentialDelimiters field, and that shows that with a comma as a delimiter three columns can be found (my recent blog post on Csv.Document might also provide some useful context here).

I also pointed it at a few other file types such as JSON and XML and it successfully returned the correct MIME type, but interestingly when I changed the file extension of my JSON file to .txt it thought the file was a text/CSV file, so I guess it’s not that smart yet. I also could not get it to return the Csv.PotentialPositions field mentioned in the documentation for fixed width files so it may still be a work in progress…?

BI Survey 18

It’s that time again: the BI Survey (the world’s largest survey of BI tools and users) needs your input. Here’s the link to take part:

https://www.efs-survey.com/uc/BARC_GmbH/396b/?a=101

As a reward for participating you’ll get a summary of the results and be entered into a draw for some Amazon vouchers. As a reward for plugging the BI Survey here I get to see the full results and blog about them later on in the year, and the results are always fascinating. Last year Power BI was breathing down the necks of more established vendors like Tableau and Qlik; this year I expect Power BI to be in an even stronger position.

Using SSAS Multidimensional As A Data Source For Power BI (Video)

The nice people at PASS have made a video of my session on “Using SSAS MD as a data source for Power BI” available to view for free on YouTube:

I’m honoured that it’s listed one of their “Best of PASS Summit 2017” sessions, and there are lots of other great videos on the same page including Alberto Ferrari’s session on DAX optimisation.

Some of the tips in this video include a few things I’ve been meaning to blog about for a while, including how important it is to set the ValueColumn property on your dimension attributes in SSAS MD – it lets you use lots of functionality in Power BI that isn’t otherwise available, including date slicers.

Data Privacy Settings In Power BI/Power Query, Part 4: Disabling Data Privacy Checks

So far in this series, I have shown how changing the data privacy settings for a data source can affect the performance of queries and even prevent them from executing completely. What I haven’t mentioned yet is that you also have the option of disabling data privacy checks completely in Power BI Desktop and Excel. In this post I will show you how you can disable data privacy checks and discuss the pros and cons of doing so.

In Power BI Desktop you can change whether data privacy checks are applied when a query executes by going to File/Options And Settings and selecting Options:

image_thumb[7]

The same settings can be found in Excel 2016 by going to the Data tab, clicking Get Data and then selecting Query Options.

image

In both cases this brings up the Options dialog.

There are two panes in the Options dialog with properties that are relevant to how data privacy checks are applied. First of all, in Global/Privacy, there are global properties that are relevant for every .pbix or Excel file that you open on your PC:

image_thumb[6]

The three options here need a little bit of explanation:

  1. Always combine data according to your Privacy Level settings for each source means that data privacy settings are always applied for every .pbix or Excel file you open, regardless of the properties (described below) that you have saved for individual files.
  2. Combine data according to each file’s Privacy Level settings means that the properties set on individual .pbix or Excel files control how the data privacy checks are applied.
  3. Always ignore Privacy Level settings means that data privacy settings are always ignored, in every .pbix or Excel file you open, regardless of settings saved for individual files.

Then, in the Current File/Privacy pane, there are properties that are saved in and apply to the current .pbix or Excel file that you have open:

image

The radio buttons here are greyed out if you have options #1 or #3 selected in the previous pane; it’s only if you have selected option #2, Combine data according to each file’s Privacy Level settings, that these properties are taken into account. You may need to close and reopen the Options dialog if you have changed settings in the previous pane but the radio buttons here remain greyed out.

The two options here are:

  1. Combine data according to your Privacy Level settings for each source, which means that the data privacy settings that you have set for each data source are used to control how queries that combine data from multiple data sources behave. This is the default setting.
  2. Ignore the Privacy Levels and potentially improve performance, which means that data privacy settings are completely ignored when queries combine data from multiple data sources.

To sum up, these two groups of properties allow you to choose whether data privacy settings are applied differently for different .pbix or Excel files, or whether, on your PC, they are always applied or always ignored.

For Power BI users it is important to remember that these settings only apply to Power BI Desktop. After a report has been published, if you are using the On-Premises Data Gateway, you also need to configure data privacy settings on the data sources used by your dataset in the Power BI portal. If you are using the On-Premises Data Gateway in Personal Mode (what used to be called the Personal Gateway) then you can configure it to ignore data privacy settings as described here. Unfortunately if you are not using Personal Mode (ie you are using what used to be called the Enterprise Gateway, and what is now just called the On-Premises Data Gateway) then at the time of writing there is no way to configure the gateway to ignore data privacy levels. You can vote here to get this changed. It’s also worth mentioning that right now you can’t combine data from online and on-premises data sources in a gateway either, although it sounds like this limitation will be addressed soon. To work around these limitations you have to import data into separate tables in the dataset and then use DAX calculated tables to combine the data instead – a nasty hack I know, but one that I’ve had to implement myself a few times.

It can be incredibly tempting to avoid the problems associated with data privacy checks by setting Power BI and Excel to ignore them completely. Doing this certainly avoids a lot of headaches and confusion with the Formula.Firewall error message and so on. It also ensures that your queries execute as fast as they can: this is not just because query folding happens whenever possible but because the act of applying the data privacy checks alone can hurt query performance. Recently I saw a case where the only data source used was an Excel workbook (so no query folding was possible) and turning off the data privacy checks made a massive difference to query performance.

However, I cannot recommend that you turn off data privacy checks for all your Excel workbooks and .pbix files by default. Firstly, if you are working with sensitive or highly-regulated data, leaving the data privacy checks in place at least forces you to consider the privacy implications of query folding on a case-by-case basis. On the other hand ignoring data privacy checks by default makes it more likely that you or one of your users will create a query that accidentally sends data to an external data source and breaches your organisation’s rules – or even the law – concerning how this data should be handled. Secondly, if you are a Power BI user and need to use the On-Premises Data Gateway, then you risk creating reports that work fine in Power BI Desktop when the data privacy checks are ignored but which cannot be refreshed after they have been published because the On-Premises Gateway still applies those checks.

Converting Lists Of Numbers To Text Ranges In Power Query

In a comment on my post on Creating Sequences of Integers And Characters In Power BI/Power Query Lists a reader, Paul G, asked me the following question:

can you reverse this? e.g i have a list (1,2,3,5,7,8,9,12,13,14,15) can i convert this to (1-3, 5 ,7-9,12-15)

This got me thinking… I was sure it could be done in M, but would it be possible using just the UI? As far as I can see, it isn’t – there’s one crucial thing I can’t do – but I would be interested to see if anyone else can come up with a no-code solution.

The Almost UI-Only Solution

Here’s the source data, an Excel table called SourceTable:

image

Here’s what I did to create the ranges in a new Power Query query:

  1. Load the table into Power Query and set the Numbers column to have a data type of Whole Number
    image
  2. Sort the Numbers column in ascending order (even though in this case it already is)
    image
  3. Add an Index Column. It doesn’t matter what number it starts at (though I started it at one) but it does matter that it has a consecutive list of whole numbers in it:
    image
  4. Select both columns in the table, go to the Add Column tab in the query editor and under the Standard button select Subtract to add a new column containing the value of [Numbers]-[Index]:
    image
    Each distinct value in this column equates to a range of numbers in our final output – for example the rows that have the value 0 in this table will become the 1-3 range in the output.
  5. Do a Group By on this table, grouping by the Inserted Subtraction column and finding the minimum and maximum values in each range:
    imageimage
  6. Remove the Inserted Subtraction column because we don’t need it any more:
    image
  7. Select the Min and Max columns and then do a Merge Columns to combine these two sets of values using a dash:
    image
    image
  8. You now have your range names, but in the case of 5-5 you need to just return 5, so add a Conditional Column to handle this like so:
    image
    image
  9. Set the resulting column to have a data type of text:
    image
  10. Remove all other columns apart from Custom:
    image
  11. And finally, combine all the values together into a single comma-delimited list. This is where the UI lets you down, at the last step! Gil Raviv shows one way to solve this problem in a recent post, and then in a follow-up posts a much more elegant solution using M from Imke Feldmann which is my preferred approach too.  I won’t repost all the steps (which are very well described here anyway) but basically you do another Group By, then alter the code of the resulting step to use Text.Combine() to aggregate the text.
    image

Job done! If anyone from the Power Query team is reading this blog, here’s a message for you: please give us a new option in the Group By dialog to concatenate all the text values in a column, rather like the DAX ConcatenateX() function. Thanks!

The Pure M Solution

Of course you would like to see a pure M solution too, wouldn’t you? Here you go:

let
    //The list to find ranges in
    Source = {1,2,3,5,7,8,9,12,13,14,15},
    //Sort in ascending order
    SortedSource = List.Sort(Source, Order.Ascending),
    //Get a list of all the positions in the list
    //rather like adding an index column
    Positions = List.Positions(SortedSource),
    //Create a list of all the groups by subtracting
    //each number from its position in the list
    Groups = List.Transform(Positions, each SortedSource{_}-_),
    //The function to use with List.Accumulate
    RangeAccFunction = (state, current) =>
        let
            //Get the current number from the iteration
            CurrentNumber = Text.From(SortedSource{current}),
            //Get the current group number
            CurrentGroup = Groups{current},
            //Get the previous group number
            LastGroupIndex = if current=0 then 0 
                                else current-1,
            LastGroup = Groups{LastGroupIndex},
            //Get the next group number
            NextGroupIndex = if current=List.Max(Positions) 
                                then 0 
		                else current+1,
            NextGroup = Groups{NextGroupIndex},
            //Generate the text for this iteration
            TextToAdd = 
                if current=0 then CurrentNumber 
		 else
                if CurrentGroup<>LastGroup 
                 then ", " & CurrentNumber 
		 else
                if CurrentGroup<>NextGroup 
                 then "-" & CurrentNumber 
		 else ""
            in
                //Return the previous text plus 
	        //the newly generated text
                state & TextToAdd,
    //Use List.Accumulate to iterate over the list of positions
    //And generate the text for the ranges
    Output = List.Accumulate(Positions, "", RangeAccFunction)
            
in
    Output

The output in this case is just the text we’re looking for:

image

This turned out to be a great opportunity to use the List.Accumulate() function; once again, Gil Raviv has a great post describing this function and how it can be used. In this case I’m using a very similar approach to the one above to group the numbers, then declaring a function called RangeAccFunction that can be passed to List.Accumulate() to generate the output text. Again, I would be curious to see other solutions!

You can download the sample workbook for this post here.

Happy First Birthday Power BI!

To mark the first anniversary of Power BI reaching RTM, Paul Turley and Adam Saxton have organised a celebration in the form of coordinated blog posts from the community and a video to say thank you to the Power BI team at Microsoft:

This has been a great year for Power BI and its success is a direct result of all of the hard work that the team have put in. Speaking personally, I am incredibly grateful for all the help and advice that I get on a daily basis from individual Microsoft employees who are often providing it in their own time. My congratulations to James Phillips and to everyone who has worked on Power BI to make it what it is today!

BI Survey 15

It’s BI Survey time again! BI Survey is the largest annual survey of BI users in the world, so if you want to share your feelings on Microsoft BI tools or whatever else you’re using then this is the opportunity to do it. As in the past, in return for promoting the survey I get access to the results when they appear later in the year, and they always make for interesting reading and a good blog post. This year I’m curious to find out what people are saying about Power BI…

Anyway, if you do want to take part (it should only take 20 minutes and you’ll also be entered in a draw for some Amazon vouchers) then here’s the link:

https://digiumenterprise.com/answer/?link=2419-3RFFUGEB