Testing Power BI Premium Features With Power BI Embedded

It is very difficult for an organisation to decide whether to buy Power BI Premium or to stick with Power BI Pro. Power BI Premium represents a big financial commitment, but how do you know whether the benefits will be worth the costs involved? This was a question that Melissa Coates and I dealt with in great detail in the second version of our white paper “Planning A Power BI Enterprise Deployment”, and I strongly suggest you read the relevant section (pages 37-42) if you are considering buying Premium.

However, it’s one thing to read a white paper and another to actually test Premium yourself – and right now there is no way of trying Premium before you buy it. As more features like SSRS paginated reports or linked entities in dataflows get added to Premium, the more likely it is that you’ll want to test them before committing to a purchase. So what do you do? The answer is to use Power BI Embedded instead. It’s not exactly the same thing as Premium (the setup experience is different, for one thing), but there are three significant facts to point out:

  • I have been assured by several senior Microsoft employees that all new Premium features will be available in both the Premium (EM and P) SKUs and the Embedded (A) SKUs. So, for example, when the SSRS paginated report feature is released it will be available in both Premium and Embedded.
  • The resources available (in terms of the number of v-cores and memory) in the various Premium SKUs mirror those available in the Embedded SKUs. For example, a P1 Premium SKU has the same resources available as an A4 Embedded SKU.
  • With Power BI Embedded, unlike Power BI Premium, you only need to pay for what you use: you can pause a Power BI Embedded capacity when you are not using it and pay nothing.

To sum up: if you want to test Premium features before you buy, all you need to do is create a new Power BI Embedded capacity in the Azure portal and assign a Workspace to it – and you’ll get access to all the Premium features. When you’ve finished, just pause the capacity. You’ll still need to pay while you’re testing, but it will be a fraction of the cost of buying Premium.

Can You Use Power Query In Power BI Or Excel To Write Data To A Data Source?

The short answer: no.

The long answer:

It is, technically, possible to use Power Query (in either Power BI or Excel) to write data to a data source; for example, here’s an old post I wrote showing how to run an UPDATE statement to update data in a table in SQL Server. However, because the Power Query engine may evaluate a value or statement more than once when a query is executed, it is not safe or supported to use it to try to create, insert, update, delete or otherwise change data in a data source – it should only be used to read data. You may see some functions listed in the M function reference that suggest otherwise, but they aren’t supported either, and indeed may not work at all.
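To make it concrete, here’s a minimal sketch of the kind of pattern that old post describes – the server, database, table and column names are hypothetical, and to repeat, this is exactly the sort of thing that is not supported:

let
    //hypothetical connection details – shown purely for illustration, this pattern is not supported
    Source = Sql.Database("myserver", "mydatabase"),
    //Value.NativeQuery sends the SQL statement to the database,
    //but the Power Query engine may evaluate it more than once per refresh
    RunUpdate = Value.NativeQuery(
        Source,
        "UPDATE dbo.MyTable SET MyColumn = 1 WHERE MyKey = 42")
in
    RunUpdate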

Thanks to Matt Masson of the dev team for providing the official position on this.

A Function To Generate The M Code For A Table Type

This is going to sound obscure, and it is, but recently I’ve been using the #table() intrinsic function in M a lot – specifically the version that takes a table type as its first parameter (as I describe here). Because it’s a bit of a pain to have to write the M code for a table type by hand, I’ve written an M function that takes a table and returns the text of the M code needed to define a table type. Here it is:

(InputTable as table) as text =>
let
    //get the name, position and type name of each column in the input table
    Source = 
        Table.Schema(InputTable),
    //sort the columns back into their original order
    SortRows = 
        Table.Sort(
            Source,
            {{"Position", Order.Ascending}}),
    //keep just the column name and type name
    RemoveColumns = 
        Table.SelectColumns(
            SortRows,
            {"Name", "TypeName"}),
    //build a "column name = type name" expression for each column,
    //using Expression.Identifier to quote the column name where necessary
    AddCustom = 
        Table.AddColumn(
            RemoveColumns, 
            "TypeNames", 
            each 
            Expression.Identifier([Name]) & " = " & [TypeName]),
    //combine all the expressions into a single piece of text
    Output = 
        "[" & Text.Combine(AddCustom[TypeNames], ", ") & "]"
in
    Output

Nothing complex here, but now that I’ve posted this I know that in the future I’ll be able to Google for it when I’m working onsite with a customer and need it!

To give you an idea of how it works, take the table that is returned by the following M expression, which calls the public TripPin OData web service:

OData.Feed(
"https://services.odata.org/TripPinRESTierService/Airports", 
null, 
[Implementation="2.0"])


Passing the table returned by this expression to the function above gives the following text – the M code for a record that lists the names of the columns in the table and their data types, suitable for use with #table:

[Name = Text.Type, IcaoCode = Text.Type, 
IataCode = Text.Type, Location = Record.Type]
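
For example, here’s how that generated text could then be used to create an empty table with the same column names and types – the empty list passed as the second parameter simply means the table has no rows:

#table(
    type table
    [Name = Text.Type, IcaoCode = Text.Type, 
    IataCode = Text.Type, Location = Record.Type],
    {}
)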

Using Power Query And Microsoft Flow To Automate The Creation Of CSV Files

I was super-excited when, a few days ago, the long-awaited integration of Power Query into Microsoft Flow was announced. I was then gutted when work commitments meant I couldn’t blog about it right away and Erik Svensen beat me to it with this excellent post showing how to push data from SQL Server via Power Query and Flow to a Power BI streaming dataset:

https://eriksvensen.wordpress.com/2018/09/25/powerquery-everywhere-now-in-microsoftflow-as-well/

Anyway, the integration of Power Query into Flow opens up so many interesting possibilities that I couldn’t resist writing about it myself today. For example: something I am asked about a lot is whether it is possible to use Power Query to automate the creation of CSV files. You can’t do this in a supported way in Power Query in Excel or Power BI (and yes, I know about all of the slightly hacky ways people have done it, like this), but it is absolutely something you can do with Power Query in Flow.

Here’s how:


First of all I have a Schedule – Recurrence trigger, which allows you to run a flow on a schedule. In this case I’m setting the Flow to run once a day at 6am UK time.


Next I have the all-important Transform data using Power Query action. At the moment you can only use Power Query with SQL Server data sources, and (at least when I tested it) it only seems to work with Azure SQL Database data sources and not with on-premises SQL Server databases via a gateway – hopefully this gets fixed soon. Clicking the Edit Query button opens up the Power Query online editor, which will be familiar to anyone who has used Power Query in Excel or Power BI Desktop, and this is where you build your query.

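For example, here’s the kind of simple query you might build against an Azure SQL Database – the server, database, schema, table and column names below are just hypothetical placeholders:

let
    //hypothetical Azure SQL Database connection details
    Source = Sql.Database("myserver.database.windows.net", "mydatabase"),
    //navigate to the table whose data will end up in the CSV file
    SalesOrders = Source{[Schema = "SalesLT", Item = "SalesOrderHeader"]}[Data],
    //keep just the columns needed in the output
    SelectColumns = Table.SelectColumns(
        SalesOrders,
        {"SalesOrderID", "OrderDate", "TotalDue"})
in
    SelectColumns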

Next I have a Create CSV Table action to convert the results returned by the query to CSV format.


And finally I take the CSV data and save it to a CSV file in OneDrive For Business with a Create File action.


Run the Flow and the CSV file is created.


And I can view the contents in Excel Online.


The Power Query/Flow integration is still in Preview and I found a few things didn’t work reliably: for example, the first few times I ran my Flow I got errors saying that it couldn’t connect to the Azure SQL Database, even though it clearly could while I was designing the query; that error went away after a while. What’s more, it only works for SQL Server data sources right now, and I really hope it gets enabled for all the other data sources that Power Query can connect to, especially Excel. These are just teething troubles though, and it’s clear that this is going to be revolutionary for Power Query and Flow users alike!

I really need to improve my Flow skills now…

Converting Decimal Numbers To Hexadecimal In Power Query M

This is a very short post! A lot of people have blogged about how to convert numbers between different bases in M (see for example Maxim Zelensky’s very elegant solution for converting from binary to decimal), but today I noticed there is a very easy way to convert a decimal number to hexadecimal using the Number.ToText() function: you just need to pass “x” as the second parameter. For example:

Number.ToText(12, "x") //returns c
Number.ToText(123, "x") //returns 7b
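
Since the second parameter seems to accept standard .NET numeric format strings, you can probably also pad the output with leading zeros by adding a precision specifier. Here’s a minimal sketch of a helper function (the name DecimalToHex is just mine, for illustration):

let
    //convert a decimal number to a hex string padded to a fixed number of digits
    DecimalToHex = (input as number, digits as number) as text =>
        Number.ToText(input, "x" & Number.ToText(digits)),
    //for example, this should return "00ff"
    Example = DecimalToHex(255, 4)
in
    Example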

I’m sure this will come in handy somewhere…

Invoking M Functions In Parallel Using List.ParallelInvoke()

I was looking at the list of M functions supported in custom connectors and not in Power BI Desktop (using the technique I blogged about here) in the latest version of the Power Query SDK when I came across an intriguing new function: List.ParallelInvoke(). It doesn’t seem to be documented anywhere, but I think I’ve worked out what it does and it’s very exciting!

Consider the following M function, declared in a custom connector:

SlowFunction = () as number =>
    Function.InvokeAfter(()=>1, #duration(0,0,0,5));

When you call it, it waits 5 seconds and returns the value 1. If you call it three times and sum up the results, as follows:

List.Sum({SlowFunction(), SlowFunction(), SlowFunction()})

…then after 15 seconds you get the value 3 back.

Now, consider the following expression:

List.Sum(
 List.ParallelInvoke(
  {SlowFunction, SlowFunction, SlowFunction}
 )
)

When this is evaluated in a custom connector, you get the value 3 back after 5 seconds – so it looks like List.ParallelInvoke() allows you to invoke a list of functions in parallel. There’s also an optional second parameter called concurrency, which seems to control the amount of parallelism. So, for example:

List.Sum(
 List.ParallelInvoke(
  {SlowFunction, SlowFunction, SlowFunction},
  2
 )
)

…returns after 10 seconds, suggesting that only two function calls at a time are invoked in parallel.

I can imagine all kinds of uses for this, for example making multiple parallel calls to data sources or doing expensive calculations in parallel. I wonder if it will ever be allowed to be used outside custom connectors?
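
For instance, here’s a purely speculative sketch of how it might be used inside a custom connector to make several web service calls at once – the URLs are made up, and since the function is undocumented I’m only guessing that this is how it’s meant to be used:

let
    //hypothetical list of URLs to call
    Urls = {
        "https://example.com/api/page1",
        "https://example.com/api/page2",
        "https://example.com/api/page3"},
    //wrap each call in a zero-parameter function so List.ParallelInvoke can invoke it
    Calls = List.Transform(Urls, (url) => () => Web.Contents(url)),
    //invoke at most two of the calls at a time in parallel
    Results = List.ParallelInvoke(Calls, 2)
in
    Results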

UPDATE: see Curt Hagenlocher’s comment below for some important information about this function.

Performance Problems With MDX Calculated Measures That Return Constants In Analysis Services Multidimensional

Recently I was working on an MDX calculation that calculated a value at a low granularity and aggregated the result – basically what I described in my blog post here. Here’s a simplified version of the calculation written on a cube built from Adventure Works data running on my local SSAS 2017 MD instance:

SCOPE([Measures].[Tax Amount]);
    SCOPE(
	[Customer].[Customer].[Customer].MEMBERS, 
	[Date].[Date].[Date].MEMBERS, 
	[Product].[Product].[Product].MEMBERS);
        THIS = [Measures].[Sales Amount]*0.08;
    END SCOPE;
END SCOPE;

All it does is take the value of the Sales Amount measure at the lowest granularities of the Customer, Date and Product dimensions and multiply it by 0.08 to find a tax value; because [Tax Amount] is a real, non-calculated measure, the result of the calculation aggregates up through the cube. [I know that I don’t have to aggregate the result of this specific calculation, but remember that this is a simplified example – in the real case I did have to write the calculation using Scope statements – and anyway the best way of handling a basic multiplication like this would be with a measure expression.]

The performance was sub-second for my test query and I was happy, but then I realised that the same tax rate was being used in other calculations and may change in the future, so I thought I would store the value 0.08 in a calculated measure:

CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS  
0.08;

SCOPE([Measures].[Tax Amount]);
    SCOPE(
	[Customer].[Customer].[Customer].MEMBERS, 
	[Date].[Date].[Date].MEMBERS, 
	[Product].[Product].[Product].MEMBERS);
        THIS = [Measures].[Sales Amount]*[Measures].[Tax Rate];
    END SCOPE;
END SCOPE;

Surely a simple change? But no: as soon as I did this, my query ran for several minutes and memory usage went through the roof until the query was automatically cancelled.


Clearly the SSAS MD Formula Engine could optimise the version with the hard-coded constant value but could not optimise the version with the calculated measure. There was nothing in Profiler to indicate the calculation was being evaluated in cell-by-cell mode though.

So I tried another variation:

CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS  
IIF([Measures].[Sales Amount]=0, NULL, 0.08);

SCOPE([Measures].[Tax Amount]);
    SCOPE(
	[Customer].[Customer].[Customer].MEMBERS, 
	[Date].[Date].[Date].MEMBERS, 
	[Product].[Product].[Product].MEMBERS);
        THIS = [Measures].[Sales Amount]*[Measures].[Tax Rate];
    END SCOPE;
END SCOPE;

This time the memory usage was completely flat, but the query was still so slow it had to be cancelled. Next, I thought I’d try setting the NON_EMPTY_BEHAVIOR property:

CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS  
0.08
, NON_EMPTY_BEHAVIOR = [Measures].[Sales Amount];

SCOPE([Measures].[Tax Amount]);
    SCOPE(
	[Customer].[Customer].[Customer].MEMBERS, 
	[Date].[Date].[Date].MEMBERS, 
	[Product].[Product].[Product].MEMBERS);
        THIS = [Measures].[Sales Amount]*[Measures].[Tax Rate];
    END SCOPE;
END SCOPE;

And guess what? The query went back to being sub-second. Strange. Now, the NON_EMPTY_BEHAVIOR property is very dangerous and this is a clear example of how it should not be set – the expression 0.08 is never going to return an empty value, regardless of the value of the [Sales Amount] measure. As a result I would be very wary of using this trick in production in case it ended up returning inconsistent results. It’s also worth noting that the following calculation, which is a correct use of NON_EMPTY_BEHAVIOR, is just as slow as the other examples above:

CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS  
IIF([Measures].[Sales Amount]=0, NULL, 0.08)
, NON_EMPTY_BEHAVIOR = [Measures].[Sales Amount];

Finally, I created a dummy table in my DSV with one row and one column to hold the 0.08 value, and then created a real, non-calculated measure from that column. When I used this measure in my calculation, the performance of my test query was also sub-second.

So, to sum up, it looks like hard-coding constant values in calculated measures is a bad idea, at least in cases like this, and either using the values themselves in your MDX calculations or creating a table and non-calculated measure specifically to hold the value is better for performance.
