OData Performance Improvements In The June 2018 Power BI Desktop Release

In the June 2018 release of Power BI Desktop there were a number of improvements made to the way the Power Query engine handles OData data sources. You can read about them here:

https://powerbi.microsoft.com/en-us/blog/power-bi-desktop-june-2018-feature-summary/#oData

However, while testing them out, I noticed one important point that the announcement didn’t make: an existing Power Query query will only benefit from these changes if you make a small change to its M code, adding the Implementation=”2.0” option to the OData.Feed() function.

Take the following M query, running on the UK Houses of Parliament OData API:

let
    Source = OData.Feed("https://api.parliament.uk/odata"),
    GovernmentOrganisation_table =
    Source{[
        Name="GovernmentOrganisation",
        Signature="table"
        ]}[Data],
    #"Expanded GroupHasPosition" = 
        Table.ExpandTableColumn(
            GovernmentOrganisation_table, 
            "GroupHasPosition", 
            {"PositionName"}, 
            {"PositionName"}),
    #"Filtered Rows" = 
        Table.SelectRows(
            #"Expanded GroupHasPosition", 
            each [PositionName] = "Chancellor of the Exchequer")
in
    #"Filtered Rows"

It queries a table called GovernmentOrganisation, expands a column called GroupHasPosition and then filters on one of the expanded columns, PositionName, to only return the rows where PositionName contains the text “Chancellor of the Exchequer”.

image

Using Fiddler in the way I describe here, I can see that when this query runs the engine first tries to fold the filter on “Chancellor of the Exchequer” and then when this request returns an error, it defaults to a very slow approach that involves making multiple requests to the API:

image

However, if you change the code above so that the OData.Feed() function uses the Implementation=”2.0” option like so:

let
    Source = OData.Feed(
        "https://api.parliament.uk/odata", 
        null, 
        [Implementation="2.0"]),
    GovernmentOrganisation_table =
    Source{[
        Name="GovernmentOrganisation",
        Signature="table"
        ]}[Data],
    #"Expanded GroupHasPosition" = 
        Table.ExpandTableColumn(
            GovernmentOrganisation_table, 
            "GroupHasPosition", 
            {"PositionName"}, 
            {"PositionName"}),
    #"Filtered Rows" = 
        Table.SelectRows(
            #"Expanded GroupHasPosition", 
            each [PositionName] = "Chancellor of the Exchequer")
in
    #"Filtered Rows"

…then Fiddler shows a completely different request made:

image

In this case it does not try to fold the filter, but it at least does the expansion in a single request. The performance of the resulting query is a lot better.

It looks like all new code generated by the Power Query Editor uses the Implementation=”2.0” option for OData.Feed() so it will get the benefits described in the post on the Power BI blog. Existing M code won’t have this option set though so you will need to update it appropriately. As always you should test thoroughly after making these changes to make sure you do indeed make your query run faster and get the same behaviour as you had before.

Line Breaks And Auto-Indent In The Power BI DAX Formula Bar

The other day I discovered something new (at least to me) while writing the DAX for a measure in Power BI Desktop: when you insert a new line in your DAX expression using SHIFT-ENTER it also auto-indents the code. I asked a few people if this was new because I was sure I hadn’t seen it before, even though I always put line breaks in my code; of course Marco had and said he thought it had been around for a while. Anyway, Marco then commented that most people didn’t know you could even put line breaks in DAX and I thought to myself I should probably write a short blog post about all this, because of course line breaks and indentation make your code much more readable.

Here’s what it looks like to write a DAX measure in Power BI Desktop using line breaks and auto-indent:

DAXIndent

To summarise:

  • Press the SHIFT and ENTER keys on the keyboard simultaneously to insert a line break in your DAX code when you are working in the Power BI DAX formula bar
  • Do this after the opening bracket of each function and after the comma of each function argument and the auto-indent will kick in
  • Bonus tip: while typing DAX, if the dropdown list has the function, table, column or measure selected that you want to use, just hit the TAB key to auto-complete. If the dropdown list hasn’t selected what you are looking for and you don’t want to keep typing, use the up and down arrow keys on the keyboard to move up and down the list until you have selected what you want.

Of course there’s always Marco and Alberto’s excellent DAX Formatter service to format your DAX code but this is a lot more convenient.

Dynamically Changing A Chart Axis In Power BI Using Bookmarks And Buttons

A very common requirement when building Power BI reports is to allow the end user to change what is displayed on a chart axis dynamically. A lot of people have blogged about how to do this – Kasper’s blog post here is a great example – but the problem is that all of these solutions involve a lot of work remodelling your data and writing DAX code. However, the good news is that now we have Bookmarks and Buttons in Power BI there’s a new, easy, code-free way of achieving the same result, at least for some chart types. In this post I’ll show you how using the same data that Kasper used in his post.

Say you have the following dataset (using data from the Adventure Works DW sample database) in Power BI Desktop:

image

…and you need to display a column chart that shows the sum of SalesAmount broken down by either Country, Region or Currency.

The first step is to create a column chart and to drag Country, Region and Currency into the Axis well:

image

At this point the column chart will show Country and you’ll have the option to drill down – but don’t drill down yet. Add a Bookmark at this point and call it Country. Do not turn on drill down mode, but click on the “Go to the next level in the hierarchy” button:

image

When you do this, Country will be completely replaced by Region:

image

Add another Bookmark called Region, then click “Go to the next level in the hierarchy” again to show Currency:

image

Add a third and final Bookmark and call this one Currency. At this point you should have three Bookmarks for the three drill states:

image

The last thing to do is to add three buttons to the report linked to the three Bookmarks:

image

In this case I’ve used the “blank” button type, turned on the Outline, added button text that matches the name of the Bookmark, and set the Action Type property to “Bookmark” and then selected the appropriate Bookmark in the Bookmark property. Here’s how my Country button is configured:

image

image

image

image

And that’s it. After the report is published (notice that I’ve also used the new ability to turn off the visual header which makes everything look much tidier) you’ll be able to click the three buttons and switch between viewing Country, Region and Currency like so:

PQSwitchAxis

Of course this only approach works with visuals like the column chart that support drilldown so you can’t use it in all cases, but it does show off how powerful and useful the Button/Bookmark combination is. Ideally the Selection Pane would be able to control the visibility not just of entire visualisations but also of the fields and measures used within a visualisation, which would enable even more scenarios like this.

You can download the sample .pbix file for this post here.

Removing Punctuation From Text With The Text.Select M Function In Power BI/Power Query/Excel Get&Transform

A new, as-yet undocumented, M function appeared in the December 2017 release of Power BI Desktop (I assume it will appear in Excel soon): Text.Select. Here’s the documentation from the Query Editor:

 TextSelect

It’s very easy to use: the first parameter takes a text value, the second parameter takes either a text value containing a single text value or a list of single characters, and it returns the text from the first parameter minus all characters that are not in the second parameter. For example, the expression:

Text.Select("Hello", "l")

…returns the text value “ll”:

image

…and the expression:

Text.Select("Hello", {"H","e","o"})

…returns the text value “Heo”:

image

There are a lot of scenarios where Text.Select will be useful, and the one that I immediately thought of was to remove punctuation from text. In one of my earliest M posts on this blog I used Text.Remove to do this while trying to find Shakespeare’s favourite words, but the problem with this approach is that you have to explicitly specify all the characters you want to remove from your text – and there could be a lot of characters that need to be excluded. Text.Select is a much better option here because it allows you to specify the characters you want to keep.

The first step to doing this is to understand how to construct the list of the characters you do want to keep. You can do this very easily in M when declaring a list using the range technique I blogged about here, so you should read that post before carrying on. The following expression returns a list containing all 26 uppercase and lowercase letters in the alphabet plus a space:

List.Combine({{"A".."Z"},{"a".."z"},{" "}})

image

Of course depending on the scenario or language you’re working with you may want to include other characters, for example apostrophes or letters with accents, too. Here’s a slightly more complex example of how this list can be used with Text.Select:

let
    SourceText = "Hi! Stop, please. What is your name?",
    CharsToInclude = List.Combine({{"A".."Z"},{"a".."z"},{" "}}),
    RemovePunc = Text.Select(SourceText, CharsToInclude)
in
    RemovePunc

The query above takes the text “Hi! Stop, please. What is your name?” and returns the text “Hi Stop please What is your name”.

image

Finally, because I couldn’t read my old M code without cringing a little bit, here’s an updated version of my query that gets the top 100 words from the Complete Works Of Shakespeare (direct from the Project Gutenberg website):

let
  URL = "http://www.gutenberg.org/cache/epub/100/pg100.txt",
  Source = Text.FromBinary(Web.Contents(URL)),
  Lowercase = Text.Lower(Source),
  RemovePunctuation = Text.Select(Lowercase, 
	List.Combine({{"a".."z"},{" "}})),
  SplitText = Splitter.SplitTextByWhitespace(QuoteStyle.None),
  SplitIntoWords = SplitText(RemovePunctuation),
  RemoveBlanks = List.Select(SplitIntoWords, each _<>" "),
  TableFromList = Table.FromColumns({RemoveBlanks},
	type table [Word=text]),
  FindWordCounts = Table.Group(
	TableFromList, 
		{"Word"}, 
		{{"Count", each Table.RowCount(_), type number}}),
  SortedRows = Table.Sort(
	FindWordCounts,
	{{"Count", Order.Descending}}),
  KeptFirstRows = Table.FirstN(SortedRows,100)
in 
  KeptFirstRows

Here they are as a word cloud (yes I know it’s not good dataviz practice, but it’s for fun):

image

You can download the .pbix file with this example in here.

BONUS FACT: another new M function appeared recently too: Function.From. You can read all about it on this thread on the Power Query forum.

The Power Query Branding Problem

A few years ago I started blogging about Power Query. Back then life was simple: I put “Power Query” in the title of a post and everyone knew what I was writing about, because Power Query was an Excel add-in you could download and install. Now, however, the technology has been renamed “Get & Transform” in Excel 2016 and is a native feature of Excel; the name “Power Query” only applies to the add-in for Excel 2010 and 2013. What’s more, the same technology is used in Power BI’s Query Editor and it’s also now in Azure Analysis Services, Analysis Services 2017 Tabular and the Common Data Service. This is obviously a good thing – I think Power Query is one of the best things to come out of Microsoft in the last decade – but it also presents me with a problem. How can I write about this technology if it doesn’t have a single, official, easily identifiable name?

In more recent times I’ve written posts with unwieldy names like “Introduction to Insert Topic Name Here in Power Query/Power BI/Excel 2016 Get & Transform” and in the future I suppose this will have to grow to “Introduction to Insert Topic Name Here in Power Query/Power BI/Excel 2016 Get & Transform/Analysis Services Data Loading/Common Data Service”. Tagging and categorising blog posts can help here, I know, but it’s the title of a blog post that’s the main determining factor as to whether it gets read or not when someone is looking at a list of search results. It’s getting ridiculous, but how else can I ensure that someone searching for the solution to a data loading problem in Excel 2016 Get & Transform will find a post I’ve written that contains the answer but shown in Power BI?

Inside Microsoft I understand that the team that builds this technology is known as the Power Query team. I certainly think about this technology as being called Power Query, as do a lot of other people in the community. However, my argument is that I can’t just use the name “Power Query” when I’m writing or speaking about this technology because most of its users – especially those who are new to it and who need the most help – don’t think of it as “Power Query”. They think of it as Excel 2016 Get & Transform, the Query Editor in Power BI Desktop and so on, the specific instances of it.

Maybe I’m making too big a deal of this, but in my opinion this is a problem not just for me but for Microsoft too. We all know how much developers rely on internet searches to find solutions to problems, and not having a single name for this technology makes it much harder to search successfully. This in turn makes it less likely that when a developer runs into a problem they will be able to solve it, which in turn means they are less likely to want to use this technology in future.

What’s the answer? It has to be to make the “Power Query” brand visible somewhere in the UI of all the products that use Power Query technology. I know there’s a risk of confusing users instead of helping them here (am I using Power Query or Power BI?), but it could be as simple as making a few small changes like renaming the “Query Editor” window to be the “Power Query Editor”:

image

I think that would be enough to let people know that “Power Query” is a technology in its own right and that content referring to “Power Query” is relevant to Excel, Power BI, SSAS and everywhere else that Power Query is used. It would also be nice if, now that M is the official name of the M language (and not Power Query Formula Language), the Advanced Editor window and the Custom Column dialog let users know that the code they were writing in them was in a language called M and not some mysterious, nameless scripting language.

What do you think? I’m interested to hear your comments and opinions…

UPDATE: victory is ours! See this comment from Faisal Mohamood of the Power Query team below
Hey there Chris – what you are saying makes complete sense. Power Query is the name of this capability and we will highlight the name of this capability as such in experiences where you are working with Power Query (and M).

The Diagnostics.ActivityId() M Function

I’ve blogged a few times about the tracing functionality that is built into Power Query/Get&Transform and Power BI (see here and here). The trace files themselves clearly contain a lot of interesting information, but there’s no official documentation about what they contain and the format seems to have changed several times. I guess they are meant for Microsoft internal use but that doesn’t stop us from taking a peek at their contents…

Whenever I’ve looked at the contents of a trace file, one problem I have faced is working out which events relate to the query that I’ve just executed. Today, though, I discovered an M function that can help with this: Diagnostics.ActivityId(). It’s not listed in the online M function reference but here’s the documentation from the function itself:

image

Calling the function does indeed return “an opaque identifier for the currently-running evaluation”. Using the following code in a query:

Diagnostics.ActivityId()

…returns the following:

image

Every time the query is refreshed a different value is returned.

Now, consider the following query that runs a query against a SQL Server database:

let
    Source = Sql.Database("localhost", "Adventure Works DW"),
    Test = Value.NativeQuery(
            Source, 
            "SELECT DISTINCT CalendarYear FROM DimDate")
in
    Test

image

How can you find the trace events that relate to a particular execution of this query? One way is to add a custom column to this query that returns the value returned by Diagnostics.ActivityId():

image

let
    Source = Sql.Database("localhost", "Adventure Works DW"),
    Test = Value.NativeQuery(
            Source, 
            "SELECT DISTINCT CalendarYear FROM DimDate"),
    #"Added Custom" = Table.AddColumn(
	Test, 
	"ActivityId", 
	each Diagnostics.ActivityId())
in
    #"Added Custom"

Then, after the query has been loaded into the Data Set you can copy the ActivityID from the table in the Data pane:

image

…and then search for the same value in the trace files:

image

Of course you’re now on your own trying to make sense of what you find in the trace file, but I hope this helps a little bit!

New M Functionality And Behaviour In Power BI Custom Data Connectors

Over the past few weeks I’ve spent some time playing around with Power BI custom data connectors and while I don’t have anything to share publicly yet (other people are way ahead of me in this respect – see the work of Igor Cotruta, Miguel Escobar and Kasper de Jonge among others) I have learned some interesting things that are worth blogging about.

First of all, the data privacy rules around combining data from different data sources do not apply in custom data connector code. As the docs say here:

Data combination checks do not occur when accessing multiple data sources from within an extension. Since all data source calls made from within the extension inherit the same authorization context, it is assumed they are “safe” to combine. Your extension will always be treated as a single data source when it comes to data combination rules. Users would still receive the regular privacy prompts when combining your source with other M sources.

Those of you who have followed my recent series on this topic, or who have struggled with the Formula.Firewall error, will appreciate how much easier this makes combining data from different sources.

Secondly, you have a lot more flexibility when it comes to different types of authentication for web services. As I showed in my session on web services and M at the Data Insights Summit, there are a lot of limitations when it comes to working with web services in Power BI or Excel. Within a custom data connector, however, you can connect to web services that use OAuth for authentication, you can make POST requests to web services that require authentication and you can pass a web API key from the credentials store through an HTTP custom header and not just through a query parameter – none of which are possible in Power BI or Excel.

I’m sure there are a lot of other useful bits of functionality or behaviour that are only available in custom data connectors – I know I’ve only just begun to learn what’s possible. Even with what I’ve listed here, though, I get the feeling that there will be a lot of cases where you will have no choice but to build a custom data connector just to be able to access certain data sources, even if you only need to create a single report. There may also be cases where it’s preferable to build a custom data connector rather than embed lots of complex M code in a Power BI report or Excel workbook, perhaps to make code portability easier. It’s a bit of a pain to have to have Visual Studio and the SDK installed in order to do this, but building a custom data connector is fairly easy if you already know M and the development experience in Visual Studio (with intellisense!) is much better than in the Advanced Query Editor window.

%d bloggers like this: