Sharing Power Query Queries With Azure Data Catalog

About a week ago, without any warning, a much-awaited new feature lit up in Azure Data Catalog: the ability to share Power Query queries between workbooks and users. In fact it’s not really a new feature but the reappearance of something that was present in the original version of Power BI for Office 365; it works in a very similar way, although some functionality like the option to search public data sources has now disappeared and some functionality seems to have changed.

How It Works

First, make sure you have an Azure Data Catalog subscription. You can sign up here and a free subscription is fine. If you want to learn more about Azure Data Catalog you can read my post from earlier this year which has a quick overview.

Now imagine that you have just created a really cool Power Query query that you think all of your colleagues will want to use. In Excel right click on the Power Query query that you want to share in the Query Pane, then select Send To Data Catalog:

image

You may need to sign in at this point – use the Organizational account that is associated with your Azure Data Catalog subscription.

image

Next you’ll see the Send to Data Catalog dialog. On the Query tab you can edit the description of the query and supply a URL to documentation:

image

You can also specify who the query is shared with:

image

Click Send and you have shared your query. At this point it will be visible in the Azure Data Catalog web portal along with all of your other assets:

image

Here you can also manage sharing, add more documentation, look at the columns returned and see a preview (if you enabled it when you shared the query). Unfortunately the Open In option is disabled at the time of writing, so you can’t open a new Excel workbook containing this query yet.

Back in Excel, if you want to use a shared query in a new workbook, you have two options on the New Query dropdown menu on the Data tab:

image

You can either search the catalog:

image

When you do this a new Search tab appears on the Excel ribbon, giving several different search options:

image

Alternatively, the My Data Catalog option allows you to see the queries you have shared:

image

Once you’ve found your query, you have two ways to consume it and it’s not immediately obvious what the differences are between them.

First you have the Load/Load To options that copy the query into your workbook and load its output to your destination of choice. At this point the new query runs like any other query, but when you open the Query Editor you’ll see it only has one step:

image

If you look at the M code you’ll see something like this:

let
    Source = Embedded.Value("959d482b-3b06-483c-84dd-f6fee2900bf9")
in
    Source

The actual query is embedded somewhere in the workbook but the M source code isn’t available for you to view or edit, you can only run it.

If you want to edit the query or see the M code you have to use the Open option in the Shared Queries pane:

image

If you do this a new workbook is created with this query in it, and in the Query Editor you’ll see you can edit this query as normal: all the steps and the M code are visible.

Finally, if you do change the query, you can update the definition or share it as a new query by using the Send To Data Catalog option again. When the Send To Data Catalog dialog appears you have two new options to update the existing shared query in the Data Catalog or to create a new shared query:

image

How Could It Be Improved?

While I’m really happy to have this functionality back, and I think a lot of people will find it useful, there’s still a lot of room for improvement. Some thoughts:

  • This really needs to extended to work with Power BI Desktop too. In fact, it’s such an obvious thing to do it must be happening soon…?
  • Both Power Query and Power BI should also extend their integration with Azure Data Catalog: you should be able to search for all types of data source and be able to create new queries from them. I know you can create new Power BI Desktop files and Excel files with Power Query queries using the Open In functionality in the Azure Data Catalog web portal, but that’s the wrong place to start the process in my opinion.
  • I find the difference between Load/Load To (where the query isn’t editable) and Open (where it is) confusing. It would be clearer to have options to download editable and non-editable versions.
  • It would be useful for Azure Data Catalog to store different versions of queries, so when you uploaded a query it didn’t overwrite the previous version and so you could roll back to an earlier version if you needed to. Source control, basically.
  • I’d like to see some kind of message appear in Excel or Power BI Desktop if a new version of a query I was using had been published, and then have the option to upgrade to the new version.
  • While it’s great to share queries in this way, it would also be cool to publish queries up to some kind of central place (a server, something cloudy) where they also executed and be available as a new data source. That way, other people could just consume the output of the query and not have to copy the query into their workbooks or Power BI reports. Maybe if you could publish an M query as an Azure Function…?

A Quick Look At Azure Data Catalog

I’ve been meaning to look at Azure Data Catalog for a long time but somehow never got round to it until now. I’m not sure why – perhaps it’s because there’s so much other cool stuff happening in the world of Power BI and Azure, and Azure Data Catalog seems a bit, um, boring in comparison. Maybe it’s a bit too much like documentation and all that other stuff we instinctively try to avoid. Whatever the reason, now I have looked at it I’m impressed and I can see its uses, not just for Power BI but also for traditional corporate BI scenarios. It certainly deserves a lot more attention from the Microsoft BI community than it seems to be getting at the moment.

First of all, what is Azure Data Catalog anyway? There’s lots of good documentation, videos and a tutorial here:
https://azure.microsoft.com/en-us/documentation/services/data-catalog/

…but the short answer is that it’s a cloud based service for cataloguing all the data sources you want to use for BI. It doesn’t store any of the data itself (except for a small amount for preview purposes, and you can turn that off), just the connection details; it also stores metadata (column names, data types etc) as well as descriptions, documentation and contact details for people who own or know about the data. End users can search the catalogue for data and even, for some data source types like SSAS, click a button to open that data source directly in Excel or Power BI Desktop.

image

Is it any good? Well, I managed to get going with it very easily. As well as an Azure subscription it does require you to have Azure Active Directory set up though, which is a pain – some of my customers (mostly the ones that have bought into Office 365 and Power BI) have it already, but I know for others it would be a showstopper. After setting everything up I was able to register SQL Server, SSAS and other types of data source very quickly using both the desktop application and the browser interface. It’s certainly simple enough for a reasonably technical end user to use and the implementation is very solid. I do have a few minor quibbles (I found the search results interface slightly confusing at first but soon came to grips with it) and it’s clear that it’s still very much a work in progress (right now the browser interface supports registering more data types than the desktop app, for example) but nothing very significant.

There is an obvious need for Azure Data Catalog if you are going to use Power BI and have embraced the whole concept of self-service BI: it makes it much easier for your users to find and consume the various different data sources that you have available. However, as I said earlier, I think on-premises users of SSAS and SSRS could also benefit from it too. It’s becoming increasingly common for me to see organisations with several SSAS cubes in production across various servers, each of them with a confusing list of measures and dimensions. Equally I’ve seen plenty of SSRS implementations with hundreds of reports, with nobody knowing what many of these reports do or even if they are used at all. Azure Data Catalog could provide a convenient central place to document and discover these cubes and reports. I guess it all depends on whether you can be bothered to do this documentation and annotation though, rather than whether you have a suitable tool for this purpose; I don’t think many organisations can be bothered, unfortunately.

One last thing: Azure Data Catalog is meant to include replacement functionality for the old Power BI for Office 365 Data Catalog but as yet it doesn’t allow you to publish or share Power Query queries. This is something that I and other Power Query/Power BI fans would love to see, especially if (unlike the old Data Catalog) it let you publish updates to your code without needing to manually update each Excel workbook/Power BI report that used it afterwards.