Data Privacy Settings In Power BI/Power Query, Part 5: The Inheritance Of Data Privacy Settings And The None Data Privacy Level

Something I didn’t understand at all when I started writing this series was how the “None” data privacy level worked. Now, however, the ever- helpful Curt Hagenlocher of the Power Query dev team has explained it to me and in this post I’ll demonstrate how it behaves and show how data privacy levels can be inherited from other data sources.

Let’s go back to the original example I used in part 1 of this series where I showed how data from an Excel workbook can be combined with data from SQL Server, and how the data privacy settings on each data source determine whether query folding takes place or not (I suggest you read that post before continuing to get some background). Now, imagine that the Excel workbook is in a folder called C:\Data Privacy Demo, and a query called FilterDay is used to get data from it:

let
    Source = 
	Excel.Workbook(
		File.Contents(
		"C:\Data Privacy Demo\FilterParameter.xlsx"
		)
	, null, true),
    FilterDay_Table = 
	Source{[Item="FilterDay",Kind="Table"]}[Data],
    ChangedType = 
	Table.TransformColumnTypes(
		FilterDay_Table,
		{{"Parameter", type text}}
	),
    Output = 
	ChangedType{0}[#"Parameter"]
in
    Output

This query gets the name of a weekday from a table in the workbook, for example the text “Friday”:

image

When this query is referenced in a second query that uses the day name to filter the data in a table in SQL Server, like so:

let
    Source = Sql.Databases("localhost"),
    DB = Source{[Name="Adventure Works DW"]}[Data],
    dbo_DimDate = DB{[Schema="dbo",Item="DimDate"]}[Data],
    RemovedColumns = Table.SelectColumns(dbo_DimDate,
        {"DateKey", "EnglishDayNameOfWeek"}),
    FilteredRows = Table.SelectRows(RemovedColumns, 
        each ([EnglishDayNameOfWeek] = FilterDay))
in
    FilteredRows

…and the query is run for the first time, then you will get prompted for credentials to access SQL Server and after that you’ll get prompted to set data privacy levels on both data sources used:

image

The dropdown boxes in the second column allow you to set the data privacy settings for each data source, but look at the data sources listed in the first column. There are two things to point out:

  • The data sources the two queries are accessing are the DimDate table in the Adventure Works DW database on localhost, and the file C:\Data Privacy Demo\FilterParameter.xlsx. However you’re not being prompted to set data privacy levels on those exact data sources, you’re being prompted to set data privacy levels on the localhost instance and the c:\ drive
  • The data source names are displayed in dropdown boxes, so there are other options to select here

Clicking each dropdown box is revealing:

image

image

For the SQL Server database you can set the data privacy level at two places: the localhost instance (the default), or the Adventure Works DW database on that instance. For the Excel workbook you get set the data privacy level at three places: the c:\ drive (the default), the folder c:\Data Privacy Demo that the Excel workbook is in, or the Excel workbook itself.

Let’s say you accept the defaults and set the data privacy settings to Public on localhost and the c:\ drive:

image

As you would expect after reading part 1 of this series, the query runs and query folding takes place:

image

image

Now, let’s say you copy the Excel file up to the root of the c:\ drive and rename it to filterparameter2.xlsx, then update the FilterDay query above to load data from this new Excel file instead:

let
    Source = 
	Excel.Workbook(
		File.Contents(
		"C:\FilterParameter2.xlsx"
		)
	, null, true),
    FilterDay_Table = 
	Source{[Item="FilterDay",Kind="Table"]}[Data],
    ChangedType = 
	Table.TransformColumnTypes(
		FilterDay_Table,
		{{"Parameter", type text}}
	),
    Output = 
	ChangedType{0}[#"Parameter"]
in
    Output

 

At this point, when you click the Data Source Settings button and look at the permissions for the file c:\filterparameter2.xlsx you will see that the privacy level is set to None:

image

However, it behaves as if it has a data privacy level of Public: the second query that gets data from SQL Server runs successfully, query folding still takes place and you are not prompted to set a data privacy level for this data source. Why?

The “None” data privacy level means that no privacy level has been set for this exact data source. However, when this happens the engine checks to see if a data privacy level has been set for the folder that this file is in and then for all folders up to the root. In this case, since the data privacy level has been set to Public for the c:\ drive, all files in all folders on that drive that have a data privacy level set to None (like this one) will inherit the c:\ drive’s setting of Public:

image

The same goes for databases on a SQL Server instance: they can inherit the data privacy settings set for the instance. The same is also true for web services, where data privacy settings can be set for different parts of a URL; for example, here’s the list of options for a call to the https://data.gov.uk/api/3/action/package_search web service described in part 2 of this series:

image

The general rule is that the engine looks for permissions for the exact data source that it’s trying to access, and if none are set then it keeps looking for more general permissions until it runs out of places to look.

In my opinion, I don’t think the way the “None” privacy level and inheritance works is very clear right now – it makes sense now I’ve had it explained to me, but the UI does nothing to help you understand what’s going on. Luckily it sounds like the dev team are considering some changes to make it more transparent. I would like to see the fact that data privacy levels have been inherited for a data source, and where they have been inherited from, called out in the Edit Permissions dialog.

8 responses

  1. Pingback: Configuring Power BI Gateway Data Sources For Files And Folders – Chris Webb's BI Blog

  2. Chris, do you know a way that I can turn off the “Native Database Query” warning message that goes on to say “Do you approve running this native query?” It is very annoying

  3. Pingback: Credentials, Data Privacy Settings And Data Sources In Power Query/Power BI « Chris Webb's BI Blog

  4. Pingback: Chris Webb's BI Blog: Data Privacy Settings In Power BI/Power Query, Part 4: Disabling Data Privacy Checks Chris Webb's BI Blog

  5. Hi Chris, thanks for the deep dive into this. I stumbled across your post when troubleshooting an issue with Excel’s Power Query engine intermittently retaining open cursors/locks on a DB2 data file (via ODBC) well after the query is done refreshing. Those cursors would remain open sometimes for 30 seconds or several hours, during which we couldn’t get an exclusive lock on our DB2 file to maniuplate it. I worked with IBM to determine that the problem is indeed with the application (i.e. Power Query) and a packet capture made it seem as if the query engine was relying on TCP resets to clear its cursors. I worked further with Microsoft who simply asked that I change the privacy level from None to Organizational. Upon doing so, it appears the cursors clean themselves up immediately after the refresh!

    I came here to learn more about how privacy levels impact this issue, and I presume the change in behavior has to do with the query engine folding several queries to the same data source.

    In any case, I found your post helpful and since I couldn’t find anything else on the interweb which related to my precise issue, I thought I’d leave this nugget here.

    Thanks again!

  6. Hello!

    Thank you for a very interesting series. I must admit that I still struggle a bit to understand the basics of these privacy settings though. I understand it as protection against sharing sensitive data within actual queries from one data source to another. Not any other type of network protection of the actual data beeing sent from one data source to another (the actual results of the queries). Correct?

    I have an actual use case where these settings have popped up. I have a created an excel file placed in folder structure only accesible by certain individuals that collect a bunch of other files in subfolders of the same directory. The master file that collects the other files have a table which the end user can modify which in turn filters things in the query that collects the other files. For this type of use case I’m guessing these privacy settings can safely be set to ignore, do you agree? Since this query is not sent to any other party I guess.

    Thanks again for a interesting blog posts on the subject! If I double posted this I’m sorry, I posted before but it didn’t seem to be created.

Leave a Reply to Gabriel Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: