Reusing A Power Query Connection

One of the things I like least about Power Query at the moment is the lack of control over where the results of your queries end up in Excel. For example, if you create a new query and load it onto a worksheet, your table always ends up in the top left hand corner of a worksheet. Furthermore, it seems like every time you refresh your query, then the table it’s bound to gets deleted and recreated so any extra calculated columns or formatting you have added to the table disappear.

There is, however, a way around this. If you go to the Data tab in Excel and click the Existing Connections button you can see the OLEDB connection that Power Query creates in the “Connections in this Workbook” section:


You can then double-click on the connection and create a new table from that connection, call it whatever you want, add formatting, calculated columns and so on and all of these things will remain intact when you refresh. Even better, when you refresh the table it will re-execute the underlying Power Query query. This is a much better experience for consuming the results of your Power Query query, in my opinion.

Incidentally, Miguel Llopis of the Power Query team told me on the forum that this is an area of functionality that they hope to address in early 2014, so hopefully this blog post will be redundant in a few months.

Implementing Common Calculations In Power Query

When I first saw Data Explorer (as Power Query used to be called), I immediately started thinking of it as a self-service ETL tool; indeed, it’s how Microsoft itself is positioning it. However I suspect that a lot of Power Query users might also want to use it for building reports too: it could be that they find DAX too difficult, and if all they need is a simple table-based report they might not want to bother with using Power Pivot at all.

If that happens then it’s going to be important to be able implement common business calculations in Power Query… hence this blog post. I’ll also admit that I wanted to know how easy it is to write these calculations – because if it isn’t easy, or if it isn’t easier than DAX or the equivalent Excel formulas, then no-one will bother doing it. So let’s look at some examples.

Simple Percentage Share

Take the following table in Excel as a starting point:


Now let’s imagine that we want to find the percentage that each fruit makes up of the grand total (ie 10+15+24=49). The way to calculate the percentage is quite easy: you need to find the grand total, then divide the sales for each fruit by the grand total. Finding the grand total requires some code though – you might think you could do it easily in the UI using the Group By functionality, but that does not allow you to sum up all the values in a column and not group by anything! (I hope this gets fixed soon). Luckily the code is quite easy: you use Table.Column() to convert the Sales column into a List, then use List.Sum() to sum up all the values in that List. Finally, you insert a new custom column to the original table and calculate the percentage. Here’s the code:


    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

    Sales = Table.Column(Source, "Sales"),

    GrandTotal = List.Sum(Sales),

    TableWithShare = Table.AddColumn(Source, "Share", each [Sales]/GrandTotal)




It would also be nice if I could get the table that Power Query outputs to remember the number formatting I apply to it after it refreshes… again, I hope this gets fixed.

Nested Share

A more complex example of a Share calculation might be where we need a share of a subtotal. Consider the following table:


In this case, let’s say we need to show each Fruit’s sales as a percentage of the total sales of its Category. This time we can use the Group By functionality to get a table containing the subtotals very easily. Next, we can join our subtotal table with the original table using Table.Join() so we have the subtotals on each row. Finally, we can do the share calculation and remove the subtotal column again. Here’s the code:


    Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],

    Subtotals = Table.Group(Source, {"Category"}, {{"Category Sales", 

        each List.Sum([Sales]), type number}}),

    JoinTables = Table.Join(Source, "Category", Subtotals, "Category"),

    InsertedCustom = Table.AddColumn(JoinTables, "Share", each [Sales]/[Category Sales]),

    RemovedColumns = Table.RemoveColumns(InsertedCustom,{"Category Sales"})




Previous Period Growth

Now consider a table with a date column (Americans: the dates are in dd/mm/yyyy format) and a sales column:


Let’s say we want to find the absolute change in Sales from the previous date to the current date on each row. Here we need to copy the table, shift the dates forward one day (by adding a value of type Duration onto the date) and then join the original table onto this new table using Table.Join() so we can see the current day sales and the previous day sales in two columns next to each other; this will then mean we can subtract one from the other to find the growth. Here’s the code:


    Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],

    ChangedType = Table.TransformColumnTypes(Source,{{"Date", type date}}),

    InsertedCustom = Table.AddColumn(ChangedType, "Next Date", each [Date]+#duration(1,0,0,0)),

    RemovedColumns = Table.RemoveColumns(InsertedCustom,{"Date"}),

    RenamedColumns = Table.RenameColumns(RemovedColumns,{{"Sales", "Previous Sales"}}),

    JoinTables = Table.Join(ChangedType, "Date", RenamedColumns, "Next Date",JoinKind.LeftOuter),

    RemovedColumns1 = Table.RemoveColumns(JoinTables,{"Next Date"}),

    SortedRows = Table.Sort(RemovedColumns1,{{"Date", Order.Ascending}}),

    Growth = Table.AddColumn(SortedRows, "Growth", each [Sales]-[Previous Sales])




Year-to-Date Sales

For the final example, take the same starting table and before and now let’s calculate a year-to-date running sum. In this case, I decided to use a function (see here for some background on this, and here for a more complex example of a function) to do the calculation. Here’s the code of the function, called YTD(), that I created:


    YTD = (SalesTable, EndDate) =>



            each [Date]<=EndDate and Date.Year([Date])=Date.Year(EndDate) ),

        SalesColumn = Table.Column(FilteredTable, "Sales"),

        YTDSales = List.Sum(SalesColumn)





It takes a table structured in the same way as the source table (ie it has to have a [Date] column and a [Sales] column), filters the data using Table.SelectRows() so only rows with a date less than or equal to the EndDate parameter, but in the same year as it, are returned, and then sums up the values in the [Sales] column of the filtered table.

It’s then very easy to use this function in a new calculated column in a query:


    Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],

    ChangedType = Table.TransformColumnTypes(Source,{{"Date", type date}}),

    InsertedCustom = Table.AddColumn(ChangedType, "YTD Sales", each YTD(ChangedType, [Date]))



And voila, we have a column containing the year-to-date sum:



All of these examples here required me to write M code, and to be honest this is not something an ordinary user will ever be able to do. Furthermore, I suspect that end users will always prefer to write their calculations as Excel formulas, although more complex calculations such as nested shares might prove beyond them. However, I believe that a good data steward will certainly be able to write M code, and the last example above makes me think that if a data steward can write a function that performs a calculation and shares it with the end users, the end users should be able to use it in their own queries quite easily. I don’t think the code I’ve written here is any worse or better than the DAX equivalent but the ability to share functions could tip the balance in favour of Power Query for some simple reporting scenarios.

You can download a workbook with all the examples above here.

Technitrain 2014 Course Schedule: Power BI, Power Pivot, SSAS, MDX and more

I’m pleased to announce that all of the courses I’m running in London next year are now live on the Technitrain site. I’m teaching a lot of them, of course, but I’ve also got quite an impressive array of other trainers who’ll be teaching too: Allan Hirt, Klaus Aschenbrenner, Jamie Thomson, Bob Phillips, Andy Leonard and Thomas Kejser. If you’re looking for top quality SQL Server education then you know where to come!

Here’s the full list of courses:


Sharing Functions In Power Query

The Power BI Preview includes the ability to share Power Query queries with other people in your organisation; I was thinking about writing a blog post on this topic, but to be honest the documentation covers it so well there’s little point repeating what it says. My assumption is that it will be data stewards who do the majority of query creation and sharing, and that it will be less able users who consume these prebuilt queries.

Sharing a query is all very well, but in Power Query it’s also possible to share functions (which are, after all, just a different type of query) too – which is the subject of this post. What’s the point of doing this, I hear you ask? Well, if a data steward shares a query, they are sharing a fixed table of data that the end user can then manipulate however they want. If however a data steward shares a function, this means that they are giving end users something slightly different – an interface that returns a different table of data depending on the parameters that are passed into it, and which may be much easier to use. Sharing a query is a bit like creating a view in SQL Server; sharing a function is more like creating a stored procedure.

Here’s an example. Let’s imagine that a user who only has the most basic Power Query skills wants to get some data from the DimDate table in the Adventure Works database. A data steward could share a query that returns the entire table, but if the user only wants a few rows from that table they are going to have to use the query as a starting point and apply a filter themselves in their own query. On the other hand, a data steward could share a function that applied a filter to the DimDate table, and then when the user came to use that function they would just have to fill in some basic parameter values and they could get the data they needed much more easily.

The M script for the function could be something like this (note: I created this function by first creating a regular query in Power Query then altering it manually to make it into a function – it wasn’t very difficult):


    GetDates=(StartYear, StartMonth, StartDate, EndYear, EndMonth, EndDate)=>


    Source = Sql.Databases("localhost"),

    #"Adventure Works DW" = Source{[Name="Adventure Works DW"]}[Data],

    DimDate = #"Adventure Works DW"{[Name="DimDate"]}[Data],

    FilteredRows = Table.SelectRows(DimDate, each [FullDateAlternateKey] >=

          #date(StartYear, StartMonth, StartDate)

          and [FullDateAlternateKey] <= #date(EndYear, EndMonth, EndDate))





This function takes six parameters: the year, month and date for a start date and an end date. It then returns the rows from DimDate that fall between the start date and the end date. The data steward would just need to share this like any other query.

The end user, in Excel, would use the Online Search dialog in Power Query to find the function they wanted:


Clicking on Filter and Shape would bring up the Query Editor dialog:


They would then need to click the Invoke button to enter the parameter values:


And once they had clicked OK, they’d have the rows they wanted (though they would need to enable Fast Combine first):


The whole process isn’t as slick as I think it could be, but I still think it’s easy enough for anyone who has mastered Power Pivot. Of course this is a very simple example; the real advantage of this would come with more complex functions that contain expressions that an end user would never be able to write on their own.

One thing I’m still not sure of, though, is what happens exactly when a data steward updates the definition of a shared query or function. It seems as though once you have imported a shared query its definition is fixed, and even if the data steward updates the definition of the query any existing users of it are stuck with the old version (I’ve asked a question on the forum about this and will update this post when I get an answer). This might be the safe thing to do – after all, you don’t want a data steward changing what a shared query returns and breaking any code you have built on top of it – but it also means that if you did need to use the updated version you would need to import it a second time. It would be nice to have the option to refresh the definition of a query after you have imported it.

UPDATE: I got a reply to my question on shared queries from Miguel Llopis of the Power Query dev team:
“That’s correct: when sharing a query into the Data Catalog, you are sharing a copy of such query; when a user searches and uses a query from the results, they are making a local copy of the query into their workbook; when a shared query owner makes updates to the query, these updates overwrite the definition of the shared query in the catalog, but doesn’t propagate (or notify) changes to existing copies of the query.”

%d bloggers like this: