Excel Cube Functions, Dynamic Arrays And Lambdas, Part 1: Getting All The Items From A Set

After my recent post on using Office Scripts and cube functions to generate Excel reports from Power BI data, Meagan Longoria asked me this question on Twitter:

To which I can only reply: guilty as charged. I have always loved the Excel cube functions even though they are maybe the least appreciated, least known and least used feature in the whole Microsoft BI stack. They have their issues (including sometimes performance) but they are great for building certain types of report in Excel that can’t be built any other way.

Anyway, the recent addition of new Lambda helper functions to Excel has made me particularly happy because they can be used with cube functions to overcome some limitations that have existed since cube functions were first released in Excel 2007, and to do some other cool things too. In this series of posts I’m going to explore some of the things they make possible.

Let’s start with something simple. In Excel, the CubeSet function can be used to return an (MDX) set of items. This set is stored in a single cell, though, and to extract each item into a cell on your worksheet you need to use the CubeRankedMember function. For example, let’s say I have a table called Sales on my worksheet:

…that is then loaded into the Excel Data Model (aka Power Pivot – although this works exactly the same if I use a Power BI dataset, Azure Analysis Services or SQL Server Analysis Services as my source):

What you can then do is use the CubeSet function to create a set of all the products like so:

=CUBESET("ThisWorkbookDataModel", "[Sales].[Product].[Product].MEMBERS", "Product Set")

…and then use the CubeRankedMember function to put each individual item of the set into a cell. Here’s a simple example worksheet, first with the formulas showing and then the results:

This example shows the fundamental problem that has always existed with CubeRankedMember though: in order to show all the items in a set you need to know how many items there are in advance, and populate as many cells with CubeRankedMember formulas as there are items. In this case see how the range B4:B6 contains the numbers 1, 2 and 3; these numbers are used in the formulas in the range C4:C6 to get the first, second and third items in the set.

If a fourth product was added to the table, however, it would not appear automatically – you would have to add another cell with another CubeRankedMember formula in it manually. I’ve seen some workarounds but they’re a bit hacky and require you to know what the maximum possible number of items in a set could ever be. Indeed that’s always been one of the key differences between cube functions and PivotTables: cube functions are static whereas PivotTables can grow and shrink dynamically when the data changes.

The new MakeArray function in Excel provides a really elegant solution to this problem: you can now write a single formula that returns a dynamic array with all the items in the set in. Assuming that the same CubeSet exists in B2 as shown above, you can do the following:

=MAKEARRAY(CUBESETCOUNT($B$2), 1, LAMBDA(r,c,CUBERANKEDMEMBER("ThisWorkbookDataModel",Sheet3!$B$2,r)))

Here’s the output:

Notice how the formulas in cell B4 returns an array that contains all three items in the set into the range B4:B6.

How does this work?

  1. The CubeSetCount function is used to get the number of items in the CubeSet in B2.
  2. The MakeArray function is then used to create an array with the number of rows returned by CubeSetCount and one column
  3. In the third parameter of MakeArray the Lambda function is used to return a function that wraps CubeRankedMember, which is then called with the current row number of the array

The nice thing about this is that when more products are added to the Sales table they automatically appear in the output of the MakeArray formula in B4. So, for example, with two more products added to the Sales table like so:

Here’s the new output of the formula, showing the two new products returned in the array automatically:

This is not very useful on its own though. In my next post I’ll show you how this can be used to build a simple report.

Generating Excel Reports Using Power BI, Cube Formulas, Office Scripts And Power Automate

Now that Excel reports connected to Power BI datasets work in Excel Online it opens up a lot of new possibilities for doing cool things with Office Scripts and Power Automate. Here’s a simple example showing how all these technologies can be put together to automatically generate batches of Excel reports from a template.

Step 1: Create a template report in Excel using cube formulas

In Excel on the desktop I created a new Excel file, created a connection to a Power BI dataset and then built a simple report using Excel cube formulas:

Here are the Excel formulas for the table on the left:

This report uses data from the UK’s Land Registry (one of my favourite data sources) and shows the average price paid and number of sales broken down by property type for a single county (specified in cell B2 of this report – in the screenshot above data for Bournemouth is shown). Here’s the formula in B2:

=CUBEMEMBER("Price Paid", "[Property Transactions].[County].[All].[BOURNEMOUTH]")

This formula is referenced by all the CUBEVALUE formulas in the body of the table so they are all sliced by the selected county.

After doing this, I saved the file to OneDrive for Business.

Step 2: Create an Office Script to change the county shown in cell B2

The aim of this exercise is to generate one copy of the report above for each county in a list of counties, so the next thing I did was create a parameterised Office Script that takes the name of a county and changes the county name used in the formula in cell B2. To do this I opened the Excel report in Excel Online, started the script recorder, changed the formula in B2 and then stopped recording. I then edited this script to take a parameter for the county name (called county) to use in the formula. Here’s the script:

function main(workbook: ExcelScript.Workbook, county: string) {
	let selectedSheet = workbook.getActiveWorksheet();
	// Set range B2 on selectedSheet
	selectedSheet.getRange("B2").setFormulaLocal("=CUBEMEMBER(\"Price Paid\", \"[Property Transactions].[County].[All].[" + county + "]\")");
}

Step 3: Create a list of counties to pass to the script

Next, I created a second Excel workbook containing a table that contained the county names to pass to the script and saved this to OneDrive for Business too:

Step 4: Create Power Automate flow to call the script once for each county in the Excel table

Last of all, I created a Power Automate flow that reads the county names from the table in the previous step, runs the script for each county, creates a copy of the original Excel report after each script run and then saves it to a folder. Here’s the flow at a high level:

In more detail, here’s the setup for the ‘List rows present in a table’ action:

Here’s the ‘Run script’ action:

Here’s the expression used to get the current county name in the loop:

items('CountyLoop')?['Counties']

…and here’s the expression used to create the destination file path:

concat('/Office Scripts Reporting/Reports/',items('CountyLoop')?['Counties'],'.xlsx')

Running this flow results in three Excel workbooks being created, one for each county with the county name in the workbook name, stored in a folder like so:

Here’s the report in BATH AND NORTH EAST SOMERSET.xlsx:

Of course I could do other things at this point like email these workbooks to different people, but there’s no need to overcomplicate things – I hope you’ve got the idea.

A few last points to make:

  • Office Scripts don’t seem to work with PivotTables connected to Power BI datasets yet – I’m sure it’s just a matter of time before they do though
  • How is this different from using Power Automate to call the Power BI export API? A paginated report can be exported to Excel but this method gives you a lot more flexibility because it allows you to use a lot more Excel functionality, not jus the functionality that paginated reports can use in its exports. It also gives you a report that is connected live back to a dataset using cube functions, not static data.
  • Generating large numbers of Excel reports like this is not something I like to encourage – why not view your report in the Power BI portal, especially now you can view live Excel reports connected to datasets there too? – but I know it’s something that customers ask for .
  • I haven’t done any performance testing but I suspect that this method may be faster than using the Power BI export API in Power Automate.

Ten Reasons Why I’m Excited About The New Power BI/Excel Integration Features

My favourite Power BI announcement at the Microsoft Business Applications Summit was, without a doubt, that Excel PivotTables connected to Power BI datasets will very soon work in the browser and not just on the desktop. This is something I have wanted for a long time, way before I joined Microsoft, so this is a feature I have a personal interest in. However I also think it’s an incredibly important step forward for Power BI in general and in this post I’ll outline the reasons why.

Before we carry on please make sure you read this post on the Excel blog which has more details on all of the new Power BI/Excel integration features that are being released. Quick summary: if you’re reading this in late May 2021 you probably won’t have all of this functionality available in your tenant yet but it is coming very soon.

So why exactly am I excited?

It makes Excel a third option for building Power BI reports

Up to now, if you wanted to build Power BI reports and share them with other people online you had two choices: regular Power BI reports and paginated reports. Now Excel gives you a third option: you can upload Power BI-connected Excel workbooks to a Power BI workspace, make them available via a Power BI app, and not only will they be fully interactive but the data in them will also update automatically when the data in your dataset updates.

PivotTables are the best way to explore Power BI data

Why do we need Excel as an option for building reports on data stored in Power BI? The first reason is data exploration. Excel PivotTables are a much better way to explore your data than any other method in Power BI, in my opinion. Why try to recreate an Excel PivotTable using a matrix visual in a regular Power BI report when you can give your users the real thing?

Cube functions also now work in the browser – and they make it easy to design financial reports

The Excel cube functions (CubeMember, CubeValue etc) are, I think, the best-kept feature in Excel. While PivotTables are great for exploring data they aren’t always so great when you want to build highly-formatted reports. The Excel cube functions make it easy to bind individual cells in a worksheet to individual values in your dataset and because they’re just like any other Excel function they allow you to use all of Excel’s formatting and charting functionality. This then makes it possible to build certain types of report, such as financial reports, much more easily. If you want to learn more about them check out this video from Peter Myers – it shows how to use them with Analysis Services but they work just the same when connected to a Power BI dataset.

Organisational data types make it easy to access Power BI data

While the Excel cube functions are very powerful they are also somewhat difficult to use and sometimes suffer from performance problems. The new organisational data types in Excel do something very similar and while they don’t yet have all the features you need to build complex reports they are also a lot easier to understand for most business users.

Excel formulas are easier than DAX for a many calculations

Everyone knows DAX can be hard sometimes. However, once you’ve got the data you need from your dataset into Excel using a PivotTable, cube functions or organisational data types you can then do your own calculations on that data using regular Excel formulas. This not only allows business users to add their own calculations easily but for BI professionals it could be the case that an Excel formula is easier to write and faster to execute than the equivalent DAX.

Excel can visualise data in ways that Power BI can’t

Excel is a very mature data visualisation tool and it has some types of chart and some formatting options that aren’t (yet) available in Power BI’s native visuals. One example that springs to mind is that you can add error bars to a bar chart in Excel; another is sparklines, although they are coming to Power BI later this year.

Power Pivot reports will also work in the browser

Even if you don’t have a Power BI pro licence, if you have a commercial version of Excel you’ll have Power Pivot and the Excel Data Model. And guess what, Power Pivot reports also now work in the browser!

Collaborate in real-time with your colleagues in Excel Online

With Excel reports connected to Power BI stored in OneDrive for Business or a SharePoint document library you get great features for collaboration and co-authoring, so you and your colleagues can analyse data together even if you’re not in the same room.

There’s a lot of other cool stuff happening in Excel right now

The Excel team are on a hot streak at the moment: dynamic arrays, LAMBDAs, LET, the beginnings of Power Query on the Mac and lots more cool new stuff has been delivered recently. If you’re only familiar with the Excel features you learned on a course 20 years ago you’re missing out on some really powerful functionality for data analysis.

Everyone knows Excel!

Last of all, it goes without saying that Excel is by far the most popular tool for working with data in the world. Everyone has it, everyone knows it and everyone wants to use it. As Power BI people we all know how difficult it is to persuade our users to abandon their old Excel habits, so why not meet them halfway? Storing data in a Power BI dataset solves many of the problems of using Excel as a reporting tool: no more manual exports, old-of-date data or multiple versions of the truth. Using Excel to build reports on top of a Power BI dataset may be much easier to learn and accept for many business users – at least at first – than learning how to build reports in Power BI Desktop.

Power BI, Excel Organisation Data Types And Images

Excel Organisation data types were released last year (see here for details), but did you know that you can now use them to bring images as well as text and numbers into Excel? Here’s a super-simple example that shows you how to do this.

Here’s a table called ‘Fruit With Image’ in a dataset that I have published to the Power BI Service:

Notice that the Data Category property on the Image column, which contains the URL of a picture of each type of fruit listed, to “Image URL” (for more details on what this does see here). If I use this table in a Power BI report, I see the name of each fruit and a picture:

So far no surprises. I can also set this table up as a Featured Table (for more details see here) so it can be used as the source for an Organisation Data Type in Excel:

The cool thing is that when I type these fruit names into Excel and mark them as the “Fruit With Image” data type (see here for more details), I can then access the Image field and it will show the image that the URL points to inside a cell:

New Option To Solve Problems With Power Query Not Loading All Data From An Excel Worksheet

Some interesting new Power Query functionality was released in the December release of Power BI Desktop (it’s also in the build of Excel I’m running) that didn’t get announced in the blog post but which will be useful for anyone using Excel as a data source. It is fully documented but I thought I’d blog about it anyway to raise awareness.

It’s a new option on the Excel.Workbook function called InferSheetDimensions. Here’s what the docs for the Excel.Workbook function say:

Can be null or a logical (true/false) value indicating whether the area of a worksheet that contains data should be inferred by reading the worksheet itself, rather than by reading the dimensions metadata from the file. This can be useful in cases where the dimensions metadata is incorrect. Note that this option is only supported for Open XML Excel files, not for legacy Excel files. Default: false.

There’s also a much more detailed explanation on this page:

https://docs.microsoft.com/en-us/power-query/connectors/excel#troubleshooting

…in the section “Missing or incomplete Excel data” on how to troubleshoot issues where Power Query isn’t loading all the data from your worksheet.

Here’s a quick summary of the problem. Open XML format Excel workbooks (ie xlsx and xlsm format files) have some metadata (the “dimensions metadata” referred to in the docs) in them that acts as a kind of bounding box describing the range of cells in each worksheet that contain values. By default, Power Query uses this metadata when it gets data from an Excel worksheet – although not when it gets data from an Excel table or named range. Most of the time this metadata is reliable but in some cases, for example where an xlsx file has been generated by third-party software that has “export to Excel” functionality, it contains inaccurate information. This doesn’t affect Excel, which still loads the workbook, but it means that Power Query may not load all the data present in a worksheet. In these cases, setting InferSheetDimensions=true in Excel.Workbook causes Power Query to ignore the dimensions metadata and load all the data from the worksheet.

The docs also point out that if the dimension metadata returns a range that much larger than the actual populated range of cells in the worksheet, this can cause performance problems when loading data. While this can be fixed by manually editing the workbook as shown here, that’s not much help if you’re dealing with Excel files generated by buggy third-party software. Since Excel is already one of the slowest data sources you can use with Power Query, I guess this is another reason to use text formats like CSV with Power Query instead.

The M Behind The New Power Query Data Types In Excel

The big news this week – at least for me – was the release of the new Power Query data types to the Excel insiders channel. You can read all about it here:

https://insider.office.com/en-us/blog/power-query-data-types-in-excel

They’re the latest manifestation of Excel linked data types; cool things are also happening with them and Power BI featured tables too.

The announcement blog post explains pretty much everything you can do right now with Power Query data types but I was curious about the M code that is used to create them. Here’s an example query that takes this source table:

…and creates this Power Query data type:

let
  Source = #table(
      type table[
        Fruit = text, 
        Colour = text, 
        Sales = number
      ], 
      {
        {"Apples", "Green", 10}, 
        {"Lemons", "Yellow", 20}, 
        {"Strawberries", "Red", 30}
      }
    ),
  #"Created data type"
    = Table.CombineColumnsToRecord(
        Source, 
        "Data type", 
        {"Fruit", "Colour", "Sales"}, 
        [DisplayNameColumn = "Fruit", TypeName
          = "Excel.DataType"]
      )
in
  #"Created data type"

The magic happens with the #”Created data type” step and the Table.CombineColumnsToRecord function; so Power Query data types are basically columns that contain record values with (I guess, I need to check) some extra metadata.

Power Query Geography And Geometry Functions In Power BI And Excel

In the August 2020 release of Power BI Desktop a couple of new Power Query functions were added: Geography.FromWellKnownText, Geography.ToWellKnownText, GeographyPoint.From, Geometry.FromWellKnownText, Geometry.ToWellKnownText and GeometryPoint.From. These functions (which are coming soon to Power Query in Excel too), make it easier to work with geographic and geometric data in the Well Known Text format. You can have all kinds of fun with these functions if you have a visual (like the Icon Map custom visual) that can display Well Known Text data, but I’ll leave that kind of thing for future blog posts. In this post I’ll explain how the basics of how the functions actually work.

Let’s start with points. The Geography.FromWellKnownText and Geometry.FromWellKnownText functions convert Well Known Text values into records, which makes it simple to do things like extract latitude and longitude values without having to do complex parsing of the text itself. For example:

Geometry.FromWellKnownText("POINT (0.1 0.2)")

Returns a record that looks like this:

Whereas the following:

Geography.FromWellKnownText("POINT (0.1 0.2)")

…returns a similar record but one with fields for latitude and longitude rather than x and y:

Now you know what the record format of a point looks like it’s easy to go in the other direction and convert a record into Well Known Text format. For example the M expression:

Geography.ToWellKnownText([Kind="POINT", Longitude=0.1, Latitude=0.3])

returns the text value

POINT(0.1 0.3)

You can also generate a point by passing a longitude and latitude (and other optional parameters) to GeographyPoint.From and GeometryPoint.From. For example:

GeographyPoint.From(0.1, 0.3)

…also returns a POINT record:

Points can then be used to build more complex objects, which in turn can be combined into more complex objects still. For example the following M query takes three points and creates a LINESTRING object that joins those three points into a line:

let
    Point1 = GeographyPoint.From(0,0),
    Point2 = GeographyPoint.From(1,1),
    Point3 = GeographyPoint.From(0,1),
    PointList = {Point1,Point2,Point3},
    LineRecord = [Kind="LINESTRING", Points=PointList],
    WKTLineString = Geography.ToWellKnownText(LineRecord)
in
    WKTLineString

The output is this:

LINESTRING(0 0, 1 1, 0 1)

Similarly, multiple LINESTRING objects can be combined into a MULTILINESTRING. For example:

let
    Point1 = GeographyPoint.From(0,0),
    Point2 = GeographyPoint.From(1,1),
    PointList1 = {Point1,Point2},
    Point3 = GeographyPoint.From(0,1),
    Point4 = GeographyPoint.From(1,0),
    PointList2 = {Point3, Point4},
    LineRecord1 = [Kind="LINESTRING", Points=PointList1],
    LineRecord2 = [Kind="LINESTRING", Points=PointList2],
    LineRecordList = {LineRecord1,LineRecord2},
    MultiLineRecord = [Kind="MULTILINESTRING", Components=LineRecordList],
    WKTMultiLineString = Geography.ToWellKnownText(MultiLineRecord)
in
    WKTMultiLineString

…returns:

MULTILINESTRING((0 0, 1 1), (0 1, 1 0))

Finally, the most useful application for this is to extract latitudes and longitudes or x and y co-ordinates from a column of values in WKT format, something like this:

All you need to do to extract latitude and longitude values from these points is add a custom column as a step with the expression

Geography.FromWellKnownText([MyPoints])

The result is a column of records which can be expanded to get latitude and longitude values:

Adding Tooltips To Column Headers In The Power Query Editor In Power BI And Excel

Here’s something that will excite all you M nerds out there! Did you know you can make tooltips appear when you hover over column headers in a table in the Power Query Editor in either Power BI Desktop or Excel?

It’s all done with M metadata, and here’s an example query:


let
source =
#table({"firstname", "lastname"}, {{"matt", "masson"}}),
tableType =
type table[firstname = Text.Type, lastname = Text.Type]
meta [
Documentation.FieldDescription =
[firstname = "Given Name", lastname = "Family Name"]
],
replaceType = Value.ReplaceType(source, tableType)
in
replaceType

Here’s what you see when your mouse hovers over the firstname column in the Power Query Editor:

Capture1

…and here’s what you see when your mouse hovers over the lastname column:

Capture2

How does this work? Here’s what each of the steps do:

  • The source step creates a simple table with two columns called firstname and lastname using #table (see here for more details on that).
  • The tabletype step declares a new table type with two columns (the same two columns in the table from the previous step) and then adds a metadata record to this type. In that record the Documentation.FieldDescription field contains the text values that will appear as tooltips when you hover over each column.
  • The replacetype step replaces the type of the table returned by source with the type declared in tabletype.

The Power Query Editor UI then looks for a Documentation.FieldDescription field in any metadata associated with a table and displays the values in that field when you hover over the appropriate column.

Note that if you add any steps to your query after this that change the table type (for example that add or remove columns), the metadata is removed ☹. That said I still feel like this might be a useful feature for anyone building a custom connector, for example.

[Thanks to Matt Masson for telling me how all this works]

 

 

Video: Power BI Data Privacy Settings Deep Dive

Over the past year or so I’ve been delivering a presentation on the Power Query engine’s data privacy settings at various conferences and Power BI user groups, in an attempt to try to pull together all the knowledge I have on this complex topic. Luckily, when I presented this session at the London Power BI User Group recently, they recorded it and posted it on YouTube here:

If you’re struggling with data privacy errors like:

Formula.Firewall: Query ‘Query1’ (step ‘xyz’) is accessing data sources that have privacy levels which cannot be used together. Please rebuild this data combination.

or

Formula.Firewall: Query ‘Query1’ (step ‘xyz’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

…in either the Power Query Editor in Power BI or Power Query/Get&Transform in Excel, then I hope this video will help you understand why you’re getting these errors and what you can do to avoid them.

A lot of what I show in this session draws on other material, such as:

  • My five-part series of posts on the Power Query data privacy settings that starts here
  • My post on how credentials and data privacy settings are stored for dynamic data sources here
  • My post here on the performance overhead of applying data privacy checks
  • Ehren von Lehe’s detailed paper on (available here) on how the engine partitions data sources while applying data privacy checks. One of the things I show in the video is that it’s now possible to see these partitions using Power Query Query Diagnostics (see here for some details – although I should probably devote a whole blog post to this in the future)

Lastly, one minor correction to something I said in the video: at the 44:32 mark I do a demo that shows how you can use M functions to avoid a Formula Firewall error. While this trick works in Power BI Desktop and Excel, it results in a dataset that can’t be refreshed in the Power BI Service unfortunately.

 

Power Query Formatter

If you’ve ever used DAX Formatter to format your DAX code and wondered why there isn’t an equivalent for Power Query/M, then wonder no more: the nice people at Power Pivot Insights in Germany (read their German-language blog here) have built one. You can find it here:

https://powerqueryformatter.com/

PQ Formatter

They also have an API you can use too.

Bonus information: did you know that Microsoft also has an open source parser for M, available here?

%d bloggers like this: