Tables.GetRelationships() Function In Power BI M

There’s a new M function in the latest release of Power BI Desktop that I spotted: Tables.GetRelationships(). Here’s the help page for it:


Basically, it returns all of the relationships between the tables in a data source such as a SQL Server database. The documentation is a bit cryptic, but here’s what I worked out.

When you use a function like Sql.Database(), you get a table that contains a list of all of the tables in a database. For example, if I use the expression:

Sql.Database("localhost", "adventure works dw")

On my local instance of the Adventure Works DW database, this is the table that is returned:


This, it turns out, is the “navigation table” that the Tables.GetRelationships() function needs for its first parameter. The name of the column that contains the links to the actual tables in the database – here, the column called “Data” – is what you pass for the second parameter. Put the two functions together in a query like this:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    Rels = Tables.GetRelationships(Source, "Data")
in
    Rels

And you get the following table:


Expand all the columns here and you get a table with one row for every relationship detected between every table in the database:


Useful if you need to report on the structure of a database, I guess. It’s a shame that this isn’t available in Power Query in Excel yet (it isn’t as of September 2015, I assume it’s coming soon) because it would be cool to use this data with NodeXL.

Calculated Tables In Power BI

There are loads of great new features in today’s release of Power BI Desktop, but for me the most important by far is the introduction of calculated tables. Miguel Llopis gives a good introduction to them in the post announcing the release, but I thought it was worth going into a bit more detail about what they are and why they’re so useful.

What are calculated tables?

Calculated tables are tables in the Data Model whose data source is a DAX expression that returns a table. Here’s a simple example. Imagine that you have already imported the DimDate dimension table from the Adventure Works DW database into your Power BI Data Model. If you go to the Data tab you’ll see the contents of that table, and on the ribbon you’ll see the new New Table button:


Clicking the button allows you to enter a new table name and a DAX expression that returns the table, such as this one that returns a filtered subset of the rows in the DimDate table:

MyFilteredDimDateTable =
FILTER (
    DimDate,
    DimDate[CalendarYear] = 2002
        && DimDate[EnglishMonthName] = "June"
        && DimDate[DayNumberOfMonth] < 5
)


Calculated tables are created when the data in the model is refreshed (like calculated columns), and after that behave like any other table – so you can create relationships between calculated tables and other tables. You can also create calculated tables whose DAX expressions reference other calculated tables. They do take up memory like other tables too, so over-using them could be a bad thing.
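
For example, here’s a minimal sketch of a second calculated table whose DAX expression references the calculated table created above (the name MyFilteredDimDateTable2 is just for illustration):

MyFilteredDimDateTable2 =
FILTER (
    MyFilteredDimDateTable,
    MyFilteredDimDateTable[DayNumberOfMonth] < 3
)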

Why are calculated tables useful?

Miguel’s blog post already lists some of the scenarios where calculated tables are useful, and I can already think of lots of practical scenarios where I’m going to be using them myself.

Role playing dimensions are one obvious use: in a lot of models you need to use the same dimension table more than once in different places, with different relationships and maybe with different filters in place. It might be that you have a single Company dimension in your data warehouse that contains all of the companies your organisation does business with; with calculated tables you only need to import that table once, and you can then use calculated tables to create filtered copies of that table to use as Supplier and Customer dimension tables, joining them to your Stock and Sales fact tables, and only showing the relevant companies in each case.
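
As a sketch of how that might look, assuming a Company table with a CompanyType column (both hypothetical names), the two calculated tables could be as simple as:

Supplier =
FILTER ( Company, Company[CompanyType] = "Supplier" )

Customer =
FILTER ( Company, Company[CompanyType] = "Customer" )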

Certainly for debugging complex DAX expressions they’re going to be handy, because they allow you to see what DAX table expressions return. We’ve already got DAX Studio for that, but now we don’t have the hassle of switching to another application…!

I can also see calculated tables as a way of doing certain types of ETL – which raises the question of whether you should do a certain operation in Get Data (ie what was Power Query) or using a calculated table. I strongly suspect that a lot of operations are going to be much faster with calculated tables because of the power of the underlying engine. It would be interesting to know if there are plans to allow Get Data to make use of calculated tables, for example as a way of buffering tables in memory, with M transformations folded back to DAX on those tables.

The Calendar() and CalendarAuto() functions

If you were wondering what the new DAX Calendar() and CalendarAuto() functions were for, well, you can probably guess now – Date tables. The Calendar() function returns a table of continuous dates within a given range. So, the expression

CalendarDemo =
CALENDAR ( "1/1/2015", "2/2/2015" )

Will return a table with one column containing all the dates from January 1st 2015 to February 2nd 2015:


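Incidentally, date literals like "1/1/2015" are interpreted according to your locale settings, so if you want to be unambiguous it’s safer to use the DATE() function instead, something like this:

CalendarDemo =
CALENDAR ( DATE ( 2015, 1, 1 ), DATE ( 2015, 2, 2 ) )
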
The CalendarAuto() function looks at all of the date columns in all of the other tables in the model and returns a similar table, one that runs from the beginning of the year containing the earliest date found in any non-calculated column in any non-calculated table, to the end of the year containing the latest such date. By default the year begins on January 1st and ends on December 31st, but there’s also an optional parameter to specify a different month for the year end, if you want to create a fiscal calendar table.
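
So, for example, a fiscal calendar table for a fiscal year ending on June 30th might be as simple as this (FiscalCalendar is just an illustrative name):

FiscalCalendar =
CALENDARAUTO ( 6 )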

I wonder if we’ll get something like the functionality in Excel 2016 for creating date tables with other useful columns?

Referencing Individual Cell Values From Tables In Power Query

[This blog post is relevant to Power Query in Excel 2010/2013, the Get & Transform section on the Data tab in Excel 2016, and the Get Data screen in Power BI Desktop. I’m going to use the term ‘Power Query’ in this post to refer to all of the previously mentioned functionality]

Sometimes, when you’re working with a table of data in Power Query, you want to be able to get the value from just one cell in that table. In this blog post I’ll show you how you can do this both in the UI and in M code, and talk through all of the more advanced options available in M. Incidentally this is a topic I covered in some detail in the M chapter of my Power Query book, but since that book is now somewhat out-of-date I thought it was worth covering again in a blog post.

Referencing Cell Values In The UI

Imagine your data source is an Excel table that looks like this:


If you import it into Power Query, and you want to get the value in the cell on the second row in ColumnB, then in the Query Editor window you just need to right-click on that cell and select Drill Down:


…and bingo, you have the value 5 returned:


Note that this is the value 5, and not the value 5 in a cell in a table – a Power Query query can return a value of any data type, and in this case it’s going to return a single integer value rather than a value of type table. If you load the output of this query into Excel you’ll still see it formatted as a table, but if you’re using the output of this query as an input for another query (for example, you might want to read a value from Excel and use that value as a filter in a SQL query) it’s much more convenient to have an integer value than a table with one column and one row.
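
As a sketch of that last scenario, you could use the single value returned by one query to filter a table in another query. The table and column names here (dbo.FactSales and Quantity) are hypothetical, and GetMiddleCell is assumed to be the name of a query that returns a single integer like the one described below:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    //Filter using the single integer value returned by the GetMiddleCell query
    FilteredRows = Table.SelectRows(FactSales, each [Quantity] = GetMiddleCell)
in
    FilteredRows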

Referencing Cell Values in M

You can see in the screenshot above the M code generated for the Drill Down by the UI, and probably guess how it works. Here’s a cleaned-up version of the query from the previous section for reference:

let
    Source = Excel.CurrentWorkbook(){[Name="SourceData"]}[Content],
    ChangeDataTypes = Table.TransformColumnTypes(Source,
                                        {{"ColumnA", Int64.Type},
                                         {"ColumnB", Int64.Type},
                                         {"ColumnC", Int64.Type}}),
    GetMiddleCell = ChangeDataTypes{1}[ColumnB]
in
    GetMiddleCell

There are three steps here:

  • Source reads the data from the Excel table
  • ChangeDataTypes sets the data types for the three columns in the table to be Whole Number
  • GetMiddleCell returns the value from the middle cell of the table returned by the ChangeDataTypes step

M allows you to refer to individual values in tables by a system of co-ordinates, using the name of the column and the zero-based row number (tables in Power Query are always assumed to be sorted, making it possible to ask for the value on the nth row). So the expression

ChangeDataTypes{1}[ColumnB]

returns the value from the cell on the second row in the column called ColumnB of the table returned by ChangeDataTypes, which is 5. Similarly, the expression

ChangeDataTypes{0}[ColumnC]

returns the value 3, which is the value in the column ColumnC on the first row.

It’s also worth pointing out that the row and column reference can be in any order, so the expression

ChangeDataTypes{1}[ColumnB]

…returns the same value as

ChangeDataTypes[ColumnB]{1}
As you’ll see in a moment, the order that the row and column reference come in could be important.

Referring To Rows Or Columns That Don’t Exist

What happens if you write an expression that refers to a row and/or column that doesn’t exist? You get an error, of course! So, using our example query, the expressions

ChangeDataTypes{4}[ColumnB]

ChangeDataTypes{1}[ColumnD]

…will both return errors, because there isn’t a fifth row in the table and there isn’t a column called ColumnD.

However, instead of an error you can return a null value by using the ? operator after the reference. For example, the expression

ChangeDataTypes{1}[ColumnD]?

returns the value null instead of an error.

You have to be careful though! The expression

ChangeDataTypes{4}?[ColumnB]

still returns an error, not because there isn’t a fifth row, but because the reference to the fifth row returns a null value and there is no column called ColumnB in that null value.

The solution here is to reverse the order of the references, like so:

ChangeDataTypes[ColumnB]{4}?

or, even better, use ? with both references:

ChangeDataTypes{4}?[ColumnB]?
Unfortunately using ? won’t stop you getting an error if you use a negative value in a row reference.
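
To see all of these variations side by side, here’s a self-contained sketch you can paste into the Advanced Editor. It builds a table similar to the one used above with #table(), rather than reading it from Excel, but the references behave in exactly the same way:

let
    Source = #table(
        {"ColumnA", "ColumnB", "ColumnC"},
        {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}),
    //Row and column both exist: returns 5
    ValidReference = Source{1}[ColumnB],
    //The column doesn't exist: returns null
    MissingColumn = Source{1}[ColumnD]?,
    //The row doesn't exist: returns null
    MissingRow = Source[ColumnB]{4}?,
    //? on both references: returns null
    BothOptional = Source{4}?[ColumnB]?
in
    BothOptional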

The Effect Of Primary Keys

Did you know that a table in Power Query can have a primary key (ie a column or columns whose values uniquely identify each row) defined on it? No? I’m not surprised: it’s not at all obvious from the UI. However there are several scenarios where Power Query will define a primary key on a table, including:

  • When you import data from a table in a relational database like SQL Server, and that table has a primary key on it
  • When you use the Remove Duplicates button to remove all duplicate values from a column or columns, which behind the scenes uses the Table.Distinct() M function
  • When you use the Table.AddKey() M function on a table (see the sketch just after this list)
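
Here’s a quick sketch of what a call to Table.AddKey() might look like, reusing the SourceData3 table from the example below; the true flag marks the key as a primary key:

let
    Source = Excel.CurrentWorkbook(){[Name="SourceData3"]}[Content],
    AddedKey = Table.AddKey(Source, {"MyKeyColumn"}, true)
in
    AddedKey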

The presence of a primary key affects how the Drill Down functionality works, and gives you another way of referencing individual cells.

Consider the following Excel table, basically the same as the table you’ve seen already but with a new column that uniquely identifies each row:


If you load the table into Power Query and then right-click on the column MyKeyColumn and select Remove Duplicates, you will have defined this column as a primary key:


(By the way, you can use the Table.Keys() function to see what keys are defined on a table in Power Query, and there’s an example query showing how to use this in the sample workbook for this post).
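
If you don’t have the workbook to hand, here’s a minimal sketch of what using Table.Keys() might look like; as far as I can tell, the output is a list of records, each with a Columns field and a Primary field:

let
    Source = Excel.CurrentWorkbook(){[Name="SourceData3"]}[Content],
    RemovedDuplicates = Table.Distinct(Source, {"MyKeyColumn"}),
    //Returns something like {[Columns = {"MyKeyColumn"}, Primary = true]}
    GetKeys = Table.Keys(RemovedDuplicates)
in
    GetKeys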

Having done this, if you use the Drill Down functionality to get the value from the second row in ColumnB once again, this time you’ll get a query something like this:


let
    Source = Excel.CurrentWorkbook(){[Name="SourceData3"]}[Content],
    ChangedDataTypes = Table.TransformColumnTypes(Source,
                                      {{"MyKeyColumn", type text},
                                       {"ColumnA", Int64.Type},
                                       {"ColumnB", Int64.Type},
                                       {"ColumnC", Int64.Type}}),
    RemovedDuplicates = Table.Distinct(ChangedDataTypes, {"MyKeyColumn"}),
    SecondRow = RemovedDuplicates{[MyKeyColumn="SecondRow"]}[ColumnB]
in
    SecondRow

The last step is the important one to look at. The row in the reference is no longer identified by the row number, but by a value from the primary key column instead:

RemovedDuplicates{[MyKeyColumn="SecondRow"]}[ColumnB]
You can still use the previous row number-based notation, but when a table has a primary key column defined on it you can also use a value from the primary key column to identify a row.

A Last Warning About Performance

Being able to reference individual values like this is incredibly useful for certain types of query and calculation. However, bear in mind that there are often many different ways of solving the same problem and not all of them will perform as well as each other. One obvious use of the techniques I’ve shown in this post would be to write a previous period growth calculation, where you need to refer to a value in a previous row in a table – but my experience is that writing calculations using row and column references prevents query folding and leads to poor performance, and an alternative approach (maybe like the ones shown here and here involving joins) often performs much better. There aren’t any general rules I can give you though; you just need to make sure you test thoroughly.

You can download the sample workbook for this post here.

Excel 2016 BI Branding Changes

Office 2016 is on the verge of being released, and although Power BI is the cool new thing Excel 2016 has added several new BI-related features too. What is also interesting – and less well publicised – is that several of the BI features in Excel 2016 have been rebranded. Specifically:

  • Power Query (which is now no longer an add-in, but native Excel functionality) is not called Power Query any more, but “Get & Transform”. It has also been squeezed onto the Data tab, next to the older data import functionality:


  • Power Map is not called Power Map any more, but “3D Maps”
  • Power View is still Power View, but as John White points out here it is no longer visible on the ribbon by default, which effectively hides it from new users, although it’s easy to add the Power View button back onto the ribbon. Power View in Excel 2016 is unchanged from Power View in Excel 2013. Read into this what you will.
  • Although Power Pivot still has its own tab on the ribbon (and has finally got a space in the middle of its name), there’s also a “Manage Data Model” button on the Data tab in the ribbon that is visible even when the Power Pivot add-in has not been enabled:
    Clicking this button opens the Power Pivot window. There’s a subtle distinction between Power Pivot the add-in and the Excel Data Model (which is the database engine behind Power Pivot, and which is present in all Windows desktop editions of Excel regardless of whether the add-in is enabled or not) that has existed since Excel 2013 and which is generally unknown or misunderstood. The fact this button is titled “Manage Data Model” rather than “Power Pivot” is telling.
  • All the add-ins now have the collective name “Data Analysis add-ins” and can be enabled with a single click:

So, clearly Excel has moved away from branding all its BI functionality as Power-something. My guess, informed by various conversations with various people in the know, is that this has happened for a couple of reasons:

  • The ‘Power’ prefix was intimidating for regular Excel users, who thought it represented something difficult and therefore not for them; it also made it look like this was functionality alien to Excel rather than a natural extension of Excel.
  • Having separate Power add-ins led to a disjointed experience, rather than giving the impression that all of these tools could and should be used together. It also made comparisons with other competing tools difficult, both for analysts and in corporate bake-offs: were the Power add-ins separate tools, or should they be considered a single tool along with Excel?
  • Previously there was a lot of confusion about whether these add-ins are anything to do with ‘Power BI’ or not. Up to now, depending on who you talked to, they either were or weren’t officially part of Power BI. Now there is a clear distinction between Excel and Power BI, despite the close technical relationships that remain.

The new names certainly address these problems and on balance I think this change was the right thing to do, even if I was quite annoyed when I first found out about them. There are significant downsides too: for example, changing the names means that several years of books, blog posts, articles and conference presentations about Power Query and Power Map now won’t be found by new users when they search the internet for help. Similarly, it won’t be obvious to new users that a lot of content is relevant for both Power BI Desktop and Excel. Now that the Power Query name has been de-emphasised, why should anyone looking at my old blog posts on that subject know that what I’ve written is still relevant for Excel 2016’s “Get & Transform” and Power BI Desktop? What would I call a second edition of my Power Query book, if I wrote one, given that Power Query exists only as the relatively nondescript “Get & Transform” in Excel 2016 and “Get Data” in Power BI Desktop?

Loading Twitter Archive Data In Power Query

If you’re a Twitter user (I’m @Technitrain if you aren’t following me already) you may be aware that you can download your entire Twitter history – all of the data from all of your tweets – as a series of .js files. All the details on how to do this are here:

Since Power Query can work with JSON data I thought it would be easy to use it to analyse my Twitter archive… but while it’s possible, it’s not entirely straightforward. The problem seems to be that Power Query doesn’t like the first line in each .js file that Twitter gives you. Removing that first line isn’t too difficult but it requires some M code, so here’s a function (I’ve called it GetTwitterArchiveFile) that handles that problem. You give it the binary data from the file and it returns a table containing all the data from that file:

(TwitterFile as binary) =>
let
    //Read data from file and interpret as Lines
    Source = Lines.FromBinary(TwitterFile),
    //Remove the first line
    RemoveFirstRow = List.Skip(Source, 1),
    //Convert back to Binary
    ConvertBackToBinary = Lines.ToBinary(RemoveFirstRow),
    //Now convert to JSON
    ConvertToJSON = Json.Document(ConvertBackToBinary),
    //Flatten to a table
    ConvertToTable = Table.FromList(ConvertToJSON, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    //Expand the first set of columns
    ExpandColumns = Table.ExpandRecordColumn(ConvertToTable, "Column1", {"source", "entities", "geo", "id_str", "text", "id", "created_at", "user", "retweeted_status", "in_reply_to_status_id_str", "in_reply_to_user_id", "in_reply_to_status_id", "in_reply_to_screen_name", "in_reply_to_user_id_str"}, {"source", "entities", "geo", "id_str", "text", "id", "created_at", "user", "retweeted_status", "in_reply_to_status_id_str", "in_reply_to_user_id", "in_reply_to_status_id", "in_reply_to_screen_name", "in_reply_to_user_id_str"})
in
    ExpandColumns

Here’s an example of how to use the above function: it’s another function (called GetFullTwitterArchive) which, when you pass it the path of your tweets folder (this will be wherever you unzipped the download that you get from Twitter), returns the combined contents of all of the .js files in that folder by calling GetTwitterArchiveFile() for each one:

(TweetsFolderPath as text) =>
let
    //Connect to Tweets folder
    Source = Folder.Files(TweetsFolderPath),
    //Remove everything but the Content column
    #"Removed Other Columns" = Table.SelectColumns(Source, {"Content"}),
    //Use a custom column to call GetTwitterArchiveFile for each .js file in the folder
    #"Added Custom" = Table.AddColumn(#"Removed Other Columns", "Custom", each GetTwitterArchiveFile([Content])),
    //Remove the Content column
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom", {"Content"}),
    //Expand all columns in the Custom column
    #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"source", "entities", "geo", "id_str", "text", "id", "created_at", "user", "retweeted_status", "in_reply_to_status_id_str", "in_reply_to_user_id", "in_reply_to_status_id", "in_reply_to_screen_name", "in_reply_to_user_id_str"}, {"source", "entities", "geo", "id_str", "text", "id", "created_at", "user", "retweeted_status", "in_reply_to_status_id_str", "in_reply_to_user_id", "in_reply_to_status_id", "in_reply_to_screen_name", "in_reply_to_user_id_str"})
in
    #"Expanded Custom"

Invoking this function in a query, for example like this:

let
    Source = GetFullTwitterArchive("C:\Users\Chris\Downloads\TwitterArchive\data\js\tweets")
in
    Source

Gives you the following output:


As you can see, there are plenty of other columns that can themselves be expanded, but this is a good starting point for any analysis.

There’s nothing really ground-breaking in what I’ve done here – it’s a fairly standard example of how you can use Power Query functions to combine data from multiple files, very similar to this example of combining data from multiple Excel files.

There’s absolutely loads of interesting data that you can play with here, but to start with here’s a query that finds the top 10 people I have replied to on Twitter:

let
    Source = GetFullTwitterArchive("C:\Users\Chris\Downloads\TwitterArchive\data\js\tweets"),
    #"Removed Other Columns" = Table.SelectColumns(Source, {"in_reply_to_screen_name"}),
    #"Grouped Rows" = Table.Group(#"Removed Other Columns", {"in_reply_to_screen_name"}, {{"Count", each Table.RowCount(_), type number}}),
    #"Filtered Rows" = Table.SelectRows(#"Grouped Rows", each ([in_reply_to_screen_name] <> null)),
    #"Sorted Rows" = Table.Sort(#"Filtered Rows", {{"Count", Order.Descending}}),
    #"Kept First Rows" = Table.FirstN(#"Sorted Rows", 10)
in
    #"Kept First Rows"

Here’s the output as a table and chart:



Looks like Jamie Thomson is the winner by a wide margin!

Here’s an example of a NodeXL graph I built from this data, showing the relationships between users who I have mentioned together in a single tweet:


You can download the sample workbook for this post, containing all the functions (but not my data) here.

Exporting All M Code From Power Query In Excel 2013

Here’s a tip that I found out about on the Power Query Technet forum that I thought was worth repeating. If you ever need a quick way of exporting all the M code for all of the queries in an Excel 2013 workbook, just open the Power Query window and click the Send a Frown button shown here:


Then, when the Send Feedback dialog appears, make sure Include Formulas is ticked then click OK:


When you do that, you’ll get an email created for you that contains a whole lot of debugging information, plus all of the M code for your queries:


Obviously, don’t send this email to Microsoft!

It’s quite easy to see the individual queries in the email. To reuse one, go to your new workbook, create a new query by selecting the Blank Query option under Other Sources, then open the Advanced Editor window and paste the code in. When you do that, however, you will need to modify the code a bit. There are three pieces of code you will need to delete:

  • At the beginning of the M code, where it says
    section Section1;
  • At the beginning of each query, a piece of code that contains the original name of the query:
    shared MyOriginalQueryName =
  • At the very end, a semi-colon

Of course in Excel 2016 we’ll have much better options for copying, pasting and moving queries in VBA, but until then we’ll have to live with hacks like this.

Thoughts On the Microsoft/Pyramid Power BI Deal

Ever since Power BI first appeared, the number one request from customers has been the ability to publish reports and dashboards to an on-premises server rather than the cloud. There were two standard responses from Microsoft to this request:

  • la-la-la-I’m-not-listening-la-la-la-cloud-la-la-la-future-la-la-la
  • Maybe we’ll have something in SharePoint

…neither of which were particularly satisfying. A lot of businesses just don’t feel comfortable with the cloud yet, and more importantly a lot of businesses can’t put their most valuable data in the cloud for a variety of legal and regulatory reasons.

Today’s announcement of a deal between Microsoft and Pyramid Analytics is big news because it means Microsoft have got a credible answer to the Power-BI-on-premises question at last. For details, read the blog posts here and here, if you haven’t already, plus the Pyramid press release and this page on the Pyramid site. It’s not a perfect solution – I had been hoping that Microsoft would unveil an on-prem Power BI server that they had been working on in secret – but it’s a lot better than what we had before. It also ties up nicely with existing on-premises SQL BI investments that customers may have, and does so without a cumbersome SharePoint dependency.

What has been announced, exactly? From the Pyramid press release:

Pyramid Analytics … today announced a strategic collaboration with Microsoft consisting of development collaboration and technology integration.

The output of this new collaboration is a range of new features in the Power BI Desktop at expected General Availability on July 24. Among those features will be an option to publish a Power BI Desktop file to Pyramid Analytics Server. This feature will enable an ease of integration between the Power BI Desktop and the Pyramid Analytics Server.

From the Pyramid blog post:

This was the result of a strategic collaboration agreement that included:

  • Pyramid Analytics help in the Power BI Desktop development
  • Technology integration to publish Power BI content to Pyramid Analytics Server
  • Go-to-market coordination so mutual customers and partners get the best of our technologies

Here are some screenshots that Pyramid gave me showing what the ‘Publish to Pyramid’ feature will look like in Power BI Desktop:



Obviously this is great news for Pyramid and I’m sure it will be a big boost for their business. They are certainly one of the leading third-party client tools for SSAS and, I guess, the one with the biggest presence worldwide. Many of my customers are also customers of theirs, and I’ve always been impressed with their products. It’s interesting that this is a partnership rather than an acquisition… maybe, given the large number of ex-ProClarity guys at Pyramid, they had no desire to see history repeat itself?

For Microsoft, I think it’s the best possible solution in the circumstances. Ever since the Power BI reboot last year, Microsoft’s BI strategy has become a lot more pragmatic – something that I, and pretty much the whole of the Microsoft BI community, have welcomed. Rather than pursue a grandiose strategy that fails on the details, the new focus is now on [shock!] building a product that people not only want to buy, but can buy easily. Some compromises have had to be made based on the position that Microsoft has found itself in, though. I guess with all the resources that are being thrown at Power BI V2, plus the development effort that is going into on-premises BI features in SQL Server 2016, it proved easier to partner with a third party vendor that already has a mature product and spare development resources, than build something from scratch.

There are some downsides to all this, though. First of all, I feel sorry for the third-party client tool vendors that aren’t Pyramid, who must be feeling hard done by right now. That’s business, I suppose. Second, Pyramid’s on-premises solution is yet another dashboarding/reporting component that must be understood and fitted in to the Microsoft BI story along with Power BI Desktop, Excel, Reporting Services, Datazen, PerformancePoint (RIP), Excel Services and so on, making the Microsoft BI pro’s life even harder still. Similarly, the Pyramid brand is likely to confuse customers who really just want to buy a single, Microsoft-branded solution (and don’t get me started on the whole Power BI/Cortana Analytics Suite branding overlap thing either).

Overall, I’m very happy as a result of this news. What with this, and the RTM of Power BI v2 tomorrow, Microsoft is back as a serious contender in BI both on-premises and in the cloud, and is catching up with the competition incredibly quickly.