Credentials, Data Privacy Settings And Data Sources In Power Query/Power BI

Recently I’ve been doing some more investigations into how data privacy settings work in Power BI. This is a subject I’ve blogged about in great detail already in a series of posts last year, but this functionality is so complex that there is always more to learn. I don’t have any profound new insights to offer; instead this blog post is a write up of a series of experiments whose results shed light onto how the process of setting data privacy levels works end-to-end.

Consider the following M query:

let
    Source = 
        Json.Document(
            Web.Contents(
                "https://data.gov.uk/api/3/action/package_search?q=cows"
                )
            ),
    result = Source[result],
    results = result[results],
    results1 = results{0},
    id = results1[id],
    output = 
        Json.Document(
            Web.Contents(
                "https://data.gov.uk/api/3/action/package_show?id=" & id
                )
            )
in
    output

The query does the following:

  • Runs a query against the UK government’s open data search API (the same API I use in this post on the RelativePath and Query options for the Web.Contents() function) to get a list of datasets related to the search term “cows” via the Package Search endpoint
  • Gets the first dataset returned by the search and finds its ID
  • Sends this ID to the Package Show endpoint in the same API to get the full JSON representation of this data set. Note that the entire URL is dynamically generated and that the Query option of Web.Contents() is not used here.

It’s a typical scenario where data privacy settings can cause problems: data from one data source, the package_search endpoint, is sent to another data source, the package_show endpoint. My series from last year on data privacy settings provides some useful background information on why this is such an important thing for the Power Query engine.

Assuming that you have never used this API before, when you try to run this query in the Power Query Editor in Power BI Desktop, you’ll see the following prompt to edit the credentials used:

image

Before you click the Edit Credentials button, there are two interesting things to point out. First, in the Query Dependencies view, you see this:

image

Notice that the Package Search endpoint is shown but not the Package Show endpoint.

Second, if you click the Data Source Settings button, you’ll see the following in the dialog that appears:

image

Not only does it only show the Package Search endpoint, there is a warning that says:

“Some data sources may not be listed because of hand-authored queries”

This refers to the output step in the query that calls the Package Show endpoint with the dynamically-generated url.

Closing this dialog and going back to the Query Editor, if you click the Edit Credentials button, you can set credentials for the data source (anonymous access is fine in this case). These credentials can be set at all levels in the path down to https://data.gov.uk/api/3/action/package_search.

image

Setting credentials at the level of https://data.gov.uk means you only get prompted once; however if you select https://data.gov.uk/api/3/action/package_search from the dropdown list and click Connect you will get prompted again to set credentials, this time with a dropdown that shows all paths down to package show:

image

Assuming you set credentials at the level of https://data.gov.uk and click Connect, then Fiddler shows that a call is made to https://data.gov.uk/api/3/action/package_search?q=cows, presumably to check whether the credentials entered actually work and you move back to the Query Editor.

image

Next, in the Query Editor, you see the data privacy settings prompt:

image

Clicking Continue brings up the data privacy levels dialog:

image

You have the choice to ignore privacy levels for this file, but of course you should always try to avoid doing that. You also have two dropdown boxes that both show https://data.gov.uk on the left-hand side and another two dropdown boxes next to them, although only the top one of these is enabled.

In the first column of dropdown boxes, in the first dropdown, you can see all points in the path from https://data.gov.uk to https://data.gov.uk/api/3/action/package_search:

image

In the dropdown box immediately underneath you can see for the first time all points in the path from https://data.gov.uk to https://data.gov.uk/api/3/action/package_show:

image

If you select https://data.gov.uk in the top-left dropdown only the top-right dropdown is enabled, and in the top-right dropdown you can set the data privacy levels Public, Organizational and Private.

image

The meanings of these three levels are described in my earlier series and here, and I won’t go into detail about what they mean in this post. The bottom-right dropdown is disabled because if you set privacy levels for https://data.gov.uk then all urls that start with this path inherit this setting. This is similar to what happens with the None data privacy setting that I describe here, I think.

Setting the data privacy level to Public on https://data.gov.uk in the top-right dropdown means the query runs successfully:

image

The expected activity is shown in Fiddler:

image

And at last the Package Show endpoint is shown in the Query Dependencies view:

image

The Data Source Settings dialog shows the same as it does above in the “Data sources in current file” tab, including the warning about hand-authored queries, but on the “Global permissions” tab there is now an entry for https://data.gov.uk :

image

Although you only set a privacy level for https://data.gov.uk earlier, it’s interesting to note that the entry for https://data.gov.uk/api/3/action/package_search has a privacy level set explicitly to Public and not to None:

image

Stepping back a few steps to the Privacy Levels dialog, if you set a privacy level of Private for https://data.gov.uk like so:

image

…then the query fails with the error “Formula.Firewall: Query ‘Query1’ (step ‘output’) is accessing data sources that have privacy levels which cannot be used together. Please rebuild this data combination.”:

image

From my point of view, this is the first really interesting finding: the two endpoints, https://data.gov.uk/api/3/action/package_search and https://data.gov.uk/api/3/action/package_show, are considered as separate data sources (which tallies with what is shown in the Query Dependencies view) even though they have both inherited their data privacy level setting from https://data.gov.uk. Since they are both Private then data cannot be sent from one source to the other, hence the error.

The second interesting finding becomes apparent if you follow the steps above with a slightly different version of the original query that uses the Query option in the call to the Package Show endpoint:

let
    Source = 
        Json.Document(
            Web.Contents(
                "https://data.gov.uk/api/3/action/package_search?q=cows"
                )
            ),
    result = Source[result],
    results = result[results],
    results1 = results{0},
    id = results1[id],
    output = 
        Json.Document(
            Web.Contents(
                "https://data.gov.uk/api/3/action/package_show",
                [Query=[#"id"=id]]
                )
            )
in
    output

In this case when you look in the Data Source Settings dialog you now see both endpoints listed and you no longer see the “hand-authored queries” warning:

image

It looks like whatever method it is that the Power Query engine searches for data sources inside a query is confused by dynamically generated urls – which might also explain why data sources that use dynamic urls can’t be refreshed after publishing.

3 responses

  1. Pingback: Experimenting With Power BI Data Privacy Settings – Curated SQL

  2. Pingback: Visual Vocab, Power Query Keyboard Shortcuts, Power BI Dataflows and more... (December 17, 2018) | Guy in a Cube

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: