Calling The Power BI REST API From Microsoft Flow, Part 2: Refreshing A Dataset When a Data Source Changes

For a while now I’ve had an idea stuck in my head: wouldn’t it be cool to build a Power BI solution where a user could enter data into an Excel workbook and then, as soon as they had done so, they could see their new data in a Power BI report? It would be really useful for planning/budgeting applications and what-if analysis. I had hoped that a DirectQuery model using the CData Excel custom connector (mentioned here) might work but the performance wasn’t good enough; using Flow with the Power BI REST API (see Part 1 of this series for details on how to get this set up) gets me closer to my goal, even if there’s still one major problem with the approach. Here’s how…

First of all, I have two Excel files saved in a OneDrive For Business folder:

image

The contents of each Excel file is more or less the same, a table containing budget data for four quarters:

image

These are the files that my hypothetical users will be editing. I also have a very basic Power BI dataset that imports the data from these two files, and a report to display the data:

image

Now for the Flow. For the trigger I use a OneDrive For Business “When a file is modified” that fires when a file in the above folder is changed:

image

I had to turn on Concurrency Control for the trigger and set the Degree of Parallelism property to 1, to make sure only one instance of the Flow ran at any one time – there’s no point trying to refresh the dataset if another instance of the Flow is already running.

image

Next I have an action from the Power BI custom connector created in my last post. It calls the Power BI API’s Get Refresh History In Group endpoint and gets details of the last refresh of the dataset:

image

This returns a JSON value containing the details of the last refresh; the JSON goes to a Parse JSON action so later on I can get the Status of the last refresh. This action requires a schema, which I generated from a sample payload given in the Power BI REST API documentation. I had to delete endTime from the list of required properties in the schema because it won’t be returned if the most recent refresh is currently in progress:

image

Next there’s an Apply to each to loop over each refresh returned by the API; since only the most recent refresh is returned I know this will only ever execute once.

image

Inside the Apply to each I have a Condition that looks at the Status property of the most recent refresh and checks whether the Status is Completed:

image

If the Status is indeed Completed, then the Flow calls the Refresh Dataset In Group endpoint to refresh the dataset:

image

If the Status of the last refresh is not Completed (either because it failed for some reason, or because there is already a refresh in progress), there is a delay of one minute and the Flow then starts itself again:

image

As I’ve said before, I’m not an expert on Flow but this seems to work reliably with the limited testing I’ve done on it. Here it is in action:

ExcelBudgetDemo

So what is the problem with this approach? It’s that if you are using Power BI Pro you can only refresh a dataset eight times a day and if you are using Power BI Premium you can only refresh a dataset forty-eight times a day. There have been rumours that this forty-eight times a day refresh limit for Premium has been lifted, or might be lifted soon, or doesn’t apply when you refresh via the API, but I specifically tested this and got the following error message: “Invalid dataset refresh request. Number of refresh requests in last 24 hours exceeded limit” (interestingly this didn’t appear as a failed refresh in the refresh history of the dataset). Since it’s reasonable to assume that any Excel file will be updated more than forty-eight times per day in a busy period I don’t think this approach is usable in the real world; all the same I’m happy with this as a proof-of-concept showing what might be possible if the refresh limit is lifted in the future though.

Calling The Power BI REST API From Microsoft Flow, Part 1: Creating A Flow Custom Connector

Playing around with Microsoft Flow recently, I was reminded of the following blog post from a few months ago by Konstantinos Ioannou about using Flow to call the Power BI REST API to refresh a dataset:

https://medium.com/@Konstantinos_Ioannou/refresh-powerbi-dataset-with-microsoft-flow-73836c727c33

I was impressed by this post when I read it, but don’t think I understood quite how many exciting possibilities this technique opens up for Power BI users until I started to use it myself. The Power BI dev team are making a big investment in the API yet most Power BI users, myself included, are not developers and can’t easily write code (or PowerShell scripts) to call the API. With Flow, however, you can use the API without writing any code at all and solve a whole series of  common problems easily. In this series of blog posts I’m going to show a few examples of this.

Before that, though, there is something to add to what Konstantinos says about creating the custom connector. In his post he describes how you can add individual actions to the custom connector, but with Flow you can also create a custom connector from an OpenAPI definition and it turns out that there is an OpenAPI definition file for the Power BI REST API available here. As a result you can create a Flow custom connector with (at the time of writing) a whopping 116 actions by following the instructions in Konstantinos’s post, but instead of using the “Create from blank” option using the “Import an OpenAPI file” option instead and uploading the Power BI REST API OpenAPI definition file. You’ll find that the text for the summaries and descriptions of many of the actions needs cleaning up – in particular, Flow doesn’t allow actions with names that finish with a dot so you’ll have to go through each one and remove that character – but doing this is much faster and easier than adding the actions manually.

image

And one last thing to point out: as Konstantinos mentions in his post, if you get a 404 error while testing the connector, just wait a few minutes!

Testing Power BI Premium Features With Power BI Embedded

It is very difficult for an organisation to decide whether to buy Power BI Premium or to stick with Power BI Pro. Power BI Premium represents a big financial commitment, but how do you know whether the benefits will be worth the costs involved? This was a question that Melissa Coates and I dealt with in great detail in the second version of our white paper “Planning A Power BI Enterprise Deployment”, and I strongly suggest you read the relevant section (pages 37-42) if you are considering buying Premium.

However, it’s one thing to read a white paper and another to actually test Premium yourself – and right now there is no way of trying Premium before you buy it. As more and more features like SSRS paginated reports or linked entities in dataflows get added to Premium, the more likely it is you are going to want to test these features before buying Premium. So what do you do? The answer is to use Power BI Embedded instead. It’s not exactly the same thing as Premium (the setup experience is different, for one thing), but there are three significant facts to point out:

  • I have been assured by several senior Microsoft employees that all new Premium features will be available in both the Premium (EM and P) SKUs and the Embedded (A) SKUs. So, for example, when the SSRS paginated report feature is released it will be available in both Premium and Embedded.
  • The resources available (in terms of the number of v-cores and memory) in the various Premium SKUs mirror those available in the Embedded SKUs. For example a P1 Premium SKU has the same resources available as an A4 Embedded SKU.
  • With Power BI Embedded, unlike Power BI Premium, you only need to pay for what you use: you can pause a Power BI Embedded capacity when you are not using it and pay nothing.

Therefore, to sum up, if you want to test Premium features before you buy, all you need to do is create a new Power BI Embedded capacity in the Azure portal and assign a Workspace to it – and you’ll get access to all the Premium features. When you’ve finished just pause the capacity. You’ll still need to pay while you’re testing but it will be a fraction of the cost of buying Premium.

Can You Use Power Query In Power BI Or Excel To Write Data To A Data Source?

The short answer: no.

The long answer:

It is, technically, possible to use Power Query (in either Power BI or Excel) to write data to a data source; for example, here’s an old post I wrote showing how to run a UPDATE statement to update data in a table in SQL Server. However because the Power Query engine may evaluate a value or statement more than once when a query is executed, it is therefore not safe or supported to use it to try to create, insert, update, delete or otherwise change data in a data source – it should only be used to read data. You may see some functions listed in the M function reference that suggest otherwise but they aren’t supported either, and indeed may not work at all.

Thanks to Matt Masson of the dev team for providing the official position on this.

A Function To Generate The M Code For A Table Type

This is going to sound obscure, and it is, but recently I’ve been using the #table() intrinsic function in M a lot – specifically the version that takes a table type as its first parameter (as I describe here) – and because it’s a bit of a pain to have to write the M code for a table type by hand, I’ve written an M function that takes a table and returns the text for the M code that is needed to define a table type. Here it is:

(InputTable as table) as text =>
let
Source =
Table.Schema(InputTable),
SortRows =
Table.Sort(
Source,
{{"Position", Order.Ascending}}),
RemoveColumns =
Table.SelectColumns(
SortRows,
{"Name", "TypeName"}),
AddCustom =
Table.AddColumn(
RemoveColumns,
"TypeNames",
each
Expression.Identifier([Name]) & " = " & [TypeName]),
Output =
"[" & Text.Combine(AddCustom[TypeNames], ", ") & "]"
in
Output

Nothing complex here, but now I’ve posted this I know that in the future I’ll be able to Google for it when I’m working onsite with a customer and I need it!

To give you an idea of how it works, take the table that is returned by the following M expression, which calls the public TripPin OData web service:

OData.Feed(
"https://services.odata.org/TripPinRESTierService/Airports",
null,
[Implementation="2.0"])
image

Passing this table to the function above returns the following text, the M code for a record that lists the names of the columns in this table and their data types, suitable for use with #table:

[Name = Text.Type, IcaoCode = Text.Type,
IataCode = Text.Type, Location = Record.Type]

Using Power Query And Microsoft Flow To Automate The Creation Of CSV Files

I was super-excited when, a few days ago, the long-awaited integration of Power Query into Microsoft Flow was announced. I was then gutted when work commitments meant I couldn’t blog about it right away and Erik Svensen beat me to it with this excellent post showing how to push data from SQL Server via Power Query and Flow to a Power BI streaming dataset:

https://eriksvensen.wordpress.com/2018/09/25/powerquery-everywhere-now-in-microsoftflow-as-well/

Anyway, the integration of Power Query into Flow opens up so many interesting possibilities so I couldn’t resist writing about it myself today. For example: something I am asked about a lot is whether it is possible to use Power Query to automate the creation of CSV files. You can’t do this in a supported way in Power Query in Excel or Power BI (and yes I know about all of the slightly hacky ways people have done it like this), but it is absolutely something you can do with Power Query in Flow.

Here’s how:

image

First of all I have a Schedule – Recurrence trigger, which allows you to run a flow on a schedule. In this case I’m setting the Flow to run once a day at 6am UK time:

image

Next I have the all-important Transform data using Power Query action. At the moment you can only use Power Query with SQL Server data sources, and (at least when I tested it) it only seems to work with Azure SQL Database data sources and not on-premises SQL Server databases via a gateway – hopefully this gets fixed soon. Clicking the Edit Query button:

image

…opens up the Power Query online editor, which will be familiar to anyone who has used Power Query in Excel or Power BI Desktop, and you build your query:

image

Next I have a Create CSV Table action to convert the results returned by the query to csv format:

image

And finally I take the csv data and save it to a CSV file in OneDrive For Business with a Create File action:

image 

Run the Flow and the CSV file is created:

image

And I can view the contents in Excel Online:

image

The Power Query/Flow integration is still in Preview and I found a few things didn’t work reliably: for example the first few times I ran my Flow I got errors saying that it couldn’t connect to the Azure SQL Database, even though it clearly could while I was designing the query, but that error went away after a while. What’s more it only works for SQL Server data sources right now and I really hope that it is enabled for all the other data sources that Power Query can connect to, especially Excel. These are just teething troubles though, and it’s clear that this is going to be revolutionary for Power Query and Flow users alike!

I really need to improve my Flow skills now…

Converting Decimal Numbers To Hexadecimal In Power Query M

This is a very short post! A lot of people have blogged about how to convert numbers between different bases in M (see for example Maxim Zelensky’s very elegant solution for converting from binary to decimal), but today I noticed there was a very easy way to convert a decimal number to hexadecimal using the Number.ToText() function: you just need to use “x” in the second parameter. For example:

[sourcecode language=”text” padlinenumbers=”true”]
Number.ToText(12, "x") //returns c
Number.ToText(123, "x") //returns 7b
[/sourcecode]

I’m sure this will come in handy somewhere…

Invoking M Functions In Parallel Using List.ParallelInvoke()

I was looking at the list of M functions supported in custom connectors and not in Power BI Desktop (using the technique I blogged about here) in the latest version of the Power Query SDK when I came across an intriguing new function: List.ParallelInvoke(). It doesn’t seem to be documented anywhere, but I think I’ve worked out what it does and it’s very exciting!

Consider the following M function, declared in a custom connector:

[sourcecode language=”text”]
SlowFunction = () as number =>
Function.InvokeAfter(()=>1, #duration(0,0,0,5));
[/sourcecode]

When you call it, it waits 5 seconds and returns the value 1. If you call it three times and sum up the results, as follows:

[sourcecode language=”text”]
List.Sum({SlowFunction(), SlowFunction(), SlowFunction()})
[/sourcecode]

…then after 15 seconds you get the value 3 back.

Now, consider the following expression:

[sourcecode language=”text”]
List.Sum(
List.ParallelInvoke(
{SlowFunction, SlowFunction, SlowFunction}
)
)
[/sourcecode]

When this is evaluated in a custom connector, you get the value 3 back after 5 seconds – so it looks like List.ParallelInvoke() allows you to invoke a list of functions in parallel. There’s also an optional second parameter called concurrency, which seems to control the amount of parallelism. So, for example:

[sourcecode language=”text”]
List.Sum(
List.ParallelInvoke(
{SlowFunction, SlowFunction, SlowFunction},
2
)
)
[/sourcecode]

…returns after 10 seconds, suggesting that only two function calls at a time are invoked in parallel.

I can imagine all kinds of uses for this, for example making multiple parallel calls to data sources or doing expensive calculations in parallel. I wonder if it will ever be allowed to be used outside custom connectors?

UPDATE: see Curt Hagenlocher’s comment below for some important information about this function.

Performance Problems With MDX Calculated Measures That Return Constants In Analysis Services Multidimensional

Recently I was working on an MDX calculation that calculated a value at a low granularity and aggregated the result – basically what I described in my blog post here. Here’s a simplified version of the calculation written on a cube built from Adventure Works data running on my local SSAS 2017 MD instance:

[sourcecode language='text'  padlinenumbers='true']
SCOPE([Measures].[Tax Amount]);
    SCOPE(
	[Customer].[Customer].[Customer].MEMBERS, 
	[Date].[Date].[Date].MEMBERS, 
	[Product].[Product].[Product].MEMBERS);
        THIS = [Measures].[Sales Amount]*0.08;
    END SCOPE;
END SCOPE;
[/sourcecode]

All it does is take the value of the Sales Amount measure at the lowest granularities of the Customer, Date and Product dimensions, multiply it by 0.08 to find a tax value, and because [Tax Amount] is a real, non-calculated measure, the result of the calculation aggregates up through the cube. [I know that I don’t have to aggregate the result of this specific calculation but remember that this is a simplified example – in the real case I did have to write the calculation using Scope statements – and anyway the best way of handling a basic multiplication like this would be with a measure expression]

The performance was sub-second for my test query and I was happy, but then I realised that the same tax rate was being used in other calculations and may change in the future, so I thought I would store the value 0.08 in a calculated measure:

[sourcecode language='text' ]
CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS  
0.08;

SCOPE([Measures].[Tax Amount]);
    SCOPE(
	[Customer].[Customer].[Customer].MEMBERS, 
	[Date].[Date].[Date].MEMBERS, 
	[Product].[Product].[Product].MEMBERS);
        THIS = [Measures].[Sales Amount]*[Measures].[Tax Rate];
    END SCOPE;
END SCOPE;
[/sourcecode]

Surely a simple change? But no, as soon as I did this my query ran for several minutes and memory usage went through the roof until the query was automatically cancelled:

image

Clearly the SSAS MD Formula Engine could optimise the version with the hard-coded constant value but could not optimise the version with the calculated measure. There was nothing in Profiler to indicate the calculation was being evaluated in cell-by-cell mode though.

So I tried another variation:

[sourcecode language='text' ]
CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS  
IIF([Measures].[Sales Amount]=0, NULL, 0.08);

SCOPE([Measures].[Tax Amount]);
    SCOPE(
	[Customer].[Customer].[Customer].MEMBERS, 
	[Date].[Date].[Date].MEMBERS, 
	[Product].[Product].[Product].MEMBERS);
        THIS = [Measures].[Sales Amount]*[Measures].[Tax Rate];
    END SCOPE;
END SCOPE;
[/sourcecode]

This time the memory usage was completely flat but the query was still so slow had to be cancelled. Next, I thought I’d try setting the NON_EMPTY_BEHAVIOR property:

[sourcecode language='text' ]
CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS  
0.08
, NON_EMPTY_BEHAVIOR = [Measures].[Sales Amount];

SCOPE([Measures].[Tax Amount]);
    SCOPE(
	[Customer].[Customer].[Customer].MEMBERS, 
	[Date].[Date].[Date].MEMBERS, 
	[Product].[Product].[Product].MEMBERS);
        THIS = [Measures].[Sales Amount]*[Measures].[Tax Rate];
    END SCOPE;
END SCOPE;
[/sourcecode]

And guess what? The query went back to being sub-second. Strange. Now the NON_EMPTY_BEHAVIOR property is very dangerous and this is a clear example of how it should not be set – the expression 0.08 is never going to return an empty value, regardless of the value of the [Sales Amount] measure. As a result I would be very wary of using this trick in production in case it ended up returning inconsistent results. It’s also worth noting that the following calculation, which is a correct use of NON_EMPTY_BEHAVIOR, is as slow as the other examples above:

[sourcecode language='text'  padlinenumbers='true']
CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS  
IIF([Measures].[Sales Amount]=0, NULL, 0.08)
, NON_EMPTY_BEHAVIOR = [Measures].[Sales Amount];
[/sourcecode]

Finally I created a dummy table in my DSV with one row and one column to hold the 0.08 value, and then created a real, non-calculated measure from the column. When I used this measure in my calculation then performance of my test query was also sub-second.

So, to sum up, it looks like hard-coding constant values in calculated measures is a bad idea, at least in cases like this, and either using the values themselves in your MDX calculations or creating a table and non-calculated measure specifically to hold the value is better for performance.

A Quick Look Some Power BI And SSAS-Related Products And Books

I  don’t like writing reviews of books or products here on my blog for a couple of reasons, the main one being that I don’t usually have the time to read/test/understand something properly so I can write a thorough review. That said I do get sent a lot of free books and evaluation licences for products that deserve a wider audience, so I thought I would write a post rounding up some of them along with a few thoughts of my own.

Custom Visuals

One of the most interesting questions related to Power BI is whether third-party software companies will be able to build businesses selling extensions to it. The most obvious way that Power BI can be extended is through custom visuals and there are several companies that have paid-for (as opposed to free) custom visuals. Zebra BI is one such company and I’ve been really impressed by what they have produced for visualising financial data:

image

I also saw recently that OKViz (part of the Marco and Alberto/SQLBI family) now have a paid-for version of their excellent Smart Filter visual with some premium features – see here for more details; similarly new features in Klaus Birringer’s Ultimate Waterfall and Ultimate Decomposition Tree visuals are only available in the paid version.

I know many Power BI users who use custom visuals have suffered with various bugs and limitations in functionality over the past few users, and I think buying commercial custom visuals rather than relying on free equivalents is one way of dealing with reliability and support problems. It’s certainly in Microsoft’s interests to have a thriving partner community in this space given that flashy visuals are a major selling point of the product. But will Power BI users want to pay for visuals when so much is available out of the box for free, especially when the cost of the visuals seems relatively high when compared to the overall cost of Power BI? I guess we’ll see.

Custom Connectors

A lot of what I’ve just said about custom visuals also applies to custom connectors, although custom connectors are a lot less mature (at the time of writing, support for custom connectors in the on-premises gateway is still in preview). However I was pleased to see this announcement from CData software that they now have over 100 custom connectors available for Power BI. It looks like what they have done is wrapped their existing ODBC providers, and as a result some of their connectors are for sources that are already available in Power BI, but even so there are a lot of new data sources here.

Incidentally, I got very, very excited when I realised that the CData connectors for Excel and Excel Online supported DirectQuery mode as well as import mode. Why, I hear you ask? Well, just think about a planning/budgeting solution where users can enter data into an Excel spreadsheet and when the numbers change in Excel, the numbers change in Power BI too; think also how this could work with Composite Models. I tried this with CData’s Excel on-premises connector and unfortunately it returned errors when the source Excel worksheet was open; I did get it to work with the Excel Online connector but it was painfully slow, even with a small amount of data. If I can get it to work better (and I may be missing some optimisations within the connector) I’ll blog about it.

Books

One of the few Power BI-related books that have been published recently is Phil Seamark’s “Beginning DAX with Power BI”. He was kind enough to send me a review copy; it’s a good introduction to the subject and I particularly like the way he introduces DAX variables early one. Definitely worth a look if you’re just starting to learn DAX.

I was also sent a copy of a slightly older book, David Parker’s “Mastering Data Visualization with Visio 2016”. David knows pretty much all there is to know about using Visio for BI (his blog is great) and while this book doesn’t cover the most exciting new development in this area – the Visio custom visual for Power BI – if you want to learn all the advanced features of Visio that you could take advantage of in Power BI then this is the book to get.

Other Products

I’ve been a big fan of SentryOne’s SSAS monitoring tool, BI Sentry, for years now but up until recently it only supported SSAS Multidimensional. It now supports SSAS Tabular too (details here), and it looks like SentryOne have done a great job of adapting it to the specific needs of the Tabular engine. I always advise my SSAS customers to invest in some kind of monitoring solution because it makes the job of detecting and solving issues like poor query performance so much easier, and to be honest BI Sentry is better than anything you would be able to build yourself.

Moving onto Power BI, if you need to generate documentation for your Power BI datasets and reports check out Power BI Documenter; the August release looks like it has some cool new features. Alternatively the latest release of Power BI Helper also allows you to generate documentation as well as lots of other useful stuff.

Something that hasn’t been properly released yet, but will be incredibly useful when it is, is MAQ Software’s Application Lifecycle Management Toolkit for Power BI. Closely related to BISM Normalizer, it will allow you to compare two Power BI datasets, merge changes, deploy only parts of a dataset (for example individual measures), and deploy to multiple datasets – all of which are things Power BI developers have been crying out for.

Last of all, the guys at DevScope also have a new(ish) product out, Power BI Robots, which automatically takes screenshots of Power BI reports and dashboards and can deliver them to various destinations such as email address and SharePoint. I haven’t looked at it yet but it seems like it could have a lot of interesting uses.