Stopping Some Users Seeing Certain Columns Or Measures In Your Power BI Report With Object Level Security And Field Parameters

If you have sensitive data in your Power BI dataset you may need to stop some users seeing the data in certain columns or measures. There is only one way to achieve this: you have to use Object Level Security (OLS) in your dataset. It’s not enough to exclude those measures or columns from your reports or to hide them, because there will always be ways for enterprising users to see data they shouldn’t be allowed to see. However the problem with OLS up to now is that it didn’t play nicely with Power BI reports and so you had to create multiple versions of the same report for different security roles. The good news is that there’s now a way to create one report connected to a dataset with OLS and have it display different columns and measures to users with different permissions.

Let’s say you have a dataset and report that looks like this:

As you can see, it displays the names and addresses of employees along with sales and bonus data.

Now let’s say that the address and bonus data should only be visible to HR and everyone else should only be able to see the names and sales values. As I said, the only way to achieve this is to create a role that uses OLS to deny access to the address and bonus columns. Gilbert Quevauvilliers has a great post showing how to set up OLS using Tabular Editor here so I won’t go into detail about how to do this, but here’s how I configured the role in Tabular Editor 3:

If you publish the report and test the role in the browser, you’ll see that you get a “The visual has unrecognized fields” error because the table in the report uses the Address and Bonus fields which, of course, the user cannot access because of the OLS:

Security is working as expected but wouldn’t it be great if, instead of seeing an error here, you could build a single report that displays all the fields when the user has permission to see them and only displays the Name and Sales fields to users who are members of the role with OLS applied?

Well, now you can thanks to the new field parameters feature. The intended use of field parameters is to enable the end users of your reports to choose the fields displayed in visuals using a slicer. Behind the scenes when you create a field parameter a table is added to your dataset with one row for each field you have chosen; this effectively makes the fields used in a visual data-driven, and you can use Row Level Security (RLS) on the table created for your field parameter to control which fields are displayed in your visual and solve the problem.

Going back to our report, the next step is to create a new field parameter with all the columns and measures used in the table:

Notice that the “Add slicer to this page” checkbox is deselected because you don’t need a slicer on the report here. Here’s what the table created for the field parameter looks like:

With the field parameter created it can be used instead of the individual fields in the table definition:

You can then edit the role that already has OLS in it to apply RLS on the field parameter table, so only the rows for the fields that are allowed by the OLS are returned:

'FieldsForMyTable'[FieldsForMyTable] = "Name" ||
'FieldsForMyTable'[FieldsForMyTable] = "Sales"

It’s important that the OLS and RLS are defined in the same role because of the restriction on combining OLS and RLS from different roles.

With all this done, when you view the report through the role you only see Name and Sales displayed:

It’s important to stress that the OLS is still securing the data here – the RLS is just preventing the errors.

One downside of this technique is that things could get complicated if you have multiple visuals that need to display different combinations of secured and non-secured fields in a report. There could also be a performance penalty: when a visual uses a field parameter an extra DAX query is run on the field parameter table to determine the fields to display, and while these queries should be extremely fast most of the time there’s always a risk that they somehow slow your report down.

In conclusion, this workaround isn’t ideal but I think it’s the best way to work with OLS in Power BI reports that’s possible at the moment.

[Thanks to John Vulner for background information on how field parameters work]

I’m Posting On The Power Query Blog Too!

In the future you’re going see me writing blog posts on the official Power Query blog as well as here on my own personal blog, and indeed the first of these posts went live a few hours ago. It’s on a new M function called Table.StopFolding which, as the name suggests, stops query folding taking place:

https://powerquery.microsoft.com/en-us/blog/stop-query-folding-with-table-stopfolding/

I’m doing this a) because I was asked very nicely by the Power Query team if I could help out, and b) because it doesn’t make sense for announcements about new Power BI or Power Query functionality, however obscure, to be made on my own personal blog rather than on an official product blog. This isn’t going to affect the number of posts here though.

Building A Reporting Solution Using Excel Power Query – Where Are We Now?

Seven years ago I gave a presentation at SQLBits called “Building a reporting solution using Power Query”. You can watch the recording here:

https://sqlbits.com/Sessions/Event14/Building_A_Reporting_Solution_Using_Power_Query

In it I showed how you could build a simple reporting solution using just Excel and Power Query, loading data into tables, handling parameterisation, making sure you get the best performance and so on. I think the session holds up pretty well: the functionality I showed hasn’t changed at all, and while in the meantime Power BI has reinvented itself and taken over the world I still think there’s a strong argument for using Excel plus Power Query instead of Power BI for some reporting scenarios (although it may be heresy to say so…).

If you follow the Excel blog you’ll know there have been a number of exciting announcements in the last few months, so I thought it would be interesting to take a look at some of them and consider the impact they have for BI and reporting use cases.

Power Query in Excel for the Mac

One of the priorities for the Excel Power Query team has been to get Power Query working in Excel on the Mac, and in the latest update we now have the Power Query Editor available. Data sources are still limited to files (CSV, Excel, XML, JSON), Excel tables/ranges, SharePoint, OData and SQL Server but they are some of the most popular sources. I’m not a Mac person so this doesn’t excite me much, but this does open up Power Query to a new demographic that has traditionally ignored Microsoft BI; for example, I was leafing through John Foreman’s excellent introductory data science book “Data Smart” recently and all the examples in it are in Excel to reach a mass audience, but… Excel for the Mac.

Power Query in Excel Online

This, on the other hand, is something I do care about: who cares what OS you’re running if you can do everything you need in the browser? Well now you can refresh Power Query in Excel Online, although again only a few data sources are supported at the moment: data in tables/ranges in the current workbook, or anonymous OData feeds. More data sources will be supported in the future and there will also be better integration with Office Scripts, so you’ll be able to refresh queries from Power Automate or via a button without needing VBA; you’ll also be able use the Power Query Editor in the browser too.

Before you get too excited about Power Query in Excel Online, though, remember one important difference between it and a Power BI report or a paginated report. In a Power BI report or a paginated report, when a user views a report, nothing they do – slicing, dicing, filtering etc – affects or is visible to any other users. With Power Query and Excel Online however you’re always working with a single copy of a document, so when one user refreshes a Power Query query and loads data into a workbook that change affects everyone. As a result, the kind of parameterised reports I show in my SQLBits presentation that work well in desktop Excel (because everyone can have their own copy of a workbook) could never work well in the browser, although I suppose Excel Online’s Sheet View feature offers a partial solution. Of course not all reports need this kind of interactivity and this does make collaboration and commenting on a report much easier; and when you’re collaborating on a report the Show Changes feature makes it easy to see who changed what.

More flexibility with Power Query data types

Being the kind of person who stores their data in Power BI I didn’t do much with Power Query data types when they were released; after all, you can create Organisation data types to access Power BI data from Excel and I prefer using Excel cube functions anyway. However if you’re not using Power BI then I can see how Power Query data types could be really useful for building reports that go beyond big, boring tables, making it much easier to create more complex report layouts.

Power Query connector for Power BI dataflows and Dataverse

Lastly, the feature I’m most excited about: the ability to load data from Power BI dataflows and Dataverse into Excel via Power Query. It’s not available yet although I promise it’s coming very soon! The ability to share cleaned and conformed data via dataflows direct to those Excel users who just want a data dump (rather than using Analyze in Excel on a Power BI dataset) will prove to be extremely popular, I think. There are a lot of improvements to dataflows coming soon too (you do remember to check the release notes regularly, don’t you?).

Conclusion

Overall it’s clear that Excel Power Query is getting better and better. It may never be able to keep pace with Power BI (what can?) but all these new features show that, for people who prefer to do everything in Excel, it’s making Excel a much better place to build reports. I feel like I need to update my SQLBits presentation now!

Understanding The “We Couldn’t Fold The Expression To The Data Source” Error In Power BI

If you’re using DirectQuery mode in Power BI you may occasionally run into the following error message:

Couldn’t load the data for this visual

OLE DB or ODBC error: [Expression.Error] We couldn’t fold the expression to the data source. Please try a simpler expression..

What does it mean and how can you fix it?

To understand what’s going on here you must first understand what query folding is. There’s some great documentation here that I strongly recommend you read, but in a nutshell query folding refers to how the Power Query engine inside Power BI can push calculation and transformation logic back to whatever data source you’re using in the form of a query – for example a SQL query if your data source is a relational database. Most of the time when people talk about query folding they are using Import mode but it’s even more important in DirectQuery mode: in DirectQuery mode not only does every transformation you create in the Power Query Editor have to fold, but every DAX query (including all your DAX calculations) generated by the visuals on your report has to be folded into one or more queries against your data source too.

You can do some pretty complex things in the Power Query Editor and in DAX and the error message above is the error you get when Power BI admits defeat and says it can’t translate a DAX query generated by a visual on a report into a query against your data source. The cause is likely to be a combination of several of the of the following:

  • A complex data model
  • Complex DAX used in measures or calculated columns
  • The use of dynamic M parameters
  • Complex transformations created in the Power Query Editor

Unfortunately it’s hard to be more specific because Power BI can fold different transformations to different data sources and this error almost never occurs in simple scenarios.

How can you avoid it? Again, I can only offer general advice:

  • Don’t do any transformations in the Power Query Editor if you’re using DirectQuery mode. If you want to use DirectQuery you should always make sure your data is modelled appropriately in whatever data source you’re using before you start designing your dataset in Power BI.
  • Keep your data model as simple as possible. For example, avoiding bi-directional relationships is a good idea.
  • Try to implement as much of the logic for your calculations in your data source and reduce the amount of DAX you need to write.
  • Try to write your DAX in a different way in the hope that Power BI will be able to fold it.

The Power Query XML Functions, Missing Or Null Values And Inconsistent Schema Inference

A few months ago one of my colleagues at Microsoft, David Browne, showed me an interesting Power Query problem with how the Xml.Tables and Xml.Document M functions handle null or missing values. I’m posting the details here because the problem seems fairly common, it causes a lot of confusion and it’s not easy to deal with.

In XML there are two ways to represent a null or missing value:<a></a> or omitting the element completely. Unfortunately the Xml.Tables and Xml.Document M functions handle these inconsistently: they treat the <a></a> form as a table but the other as a scalar.

For example consider the following M query that takes some XML (with no missing values), passes it to Xml.Tables and then expands the columns:

let
    Source = "<?xml version=""1.0"" encoding=""UTF-8""?>
    <Record>
     <Location>
        <id>123</id>
        <code>abc</code>
     </Location>
     <Location>
        <id>123</id>
        <code>hello</code>     
     </Location>
    <Location>
        <id>123</id>
        <code>124</code>     
     </Location>
</Record>",
   t0 = Xml.Tables(Source),
    #"Expanded Table" = Table.ExpandTableColumn(t0, "Table", {"id", "code"}, {"id", "code"})
   
in
   #"Expanded Table"

Here’s the output:

Notice how the code column contains scalar values.

Now compare this with a query where the XML contains a missing value using the <a></a> form but where the query is otherwise identical:

let
    Source = "<?xml version=""1.0"" encoding=""UTF-8""?>
    <Record>
     <Location>
        <id>123</id>
        <code>abc</code>
     </Location>
     <Location>
        <id>123</id>
        <code></code>     
     </Location>
    <Location>
        <id>123</id>
        <code>124</code>     
     </Location>
</Record>",
   t0 = Xml.Tables(Source),
    #"Expanded Table" = Table.ExpandTableColumn(t0, "Table", {"id", "code"}, {"id", "code"})
   
in
   #"Expanded Table"

Here’s the output:

Notice how the code column now contains values of type table because of the presence of a missing value.

Finally, here’s one last example where the missing value is handled by omitting the element:

let
    Source = "<?xml version=""1.0"" encoding=""UTF-8""?>
    <Record>
     <Location>
        <id>123</id>
        <code>abc</code>
     </Location>
     <Location>
        <id>123</id> 
     </Location>
    <Location>
        <id>123</id>
        <code>124</code>     
     </Location>
</Record>",
   t0 = Xml.Tables(Source),
    #"Expanded Table" = Table.ExpandTableColumn(t0, "Table", {"id", "code"}, {"id", "code"})
   
in
   #"Expanded Table"

Here’s the output:

Notice how code now contains scalar values but the value in that column on the second row is null.

You can probably guess the kind of problems this can cause: if your source XML sometimes contains missing values and sometimes doesn’t you can end up with errors if you try to expand a column that sometimes contains table values and sometimes doesn’t. It’s not a bug but sadly this is just one of those things which is now very difficult for the Power Query team to change.

There’s no easy way to work around this unless you can change the way your data source generates its XML, which is unlikely. Probably the best thing you can do is use the Xml.Document function; it has the same problem but it’s slightly easier to deal with given how it returns values in a single column, which means you can use a custom column to trap table values something like this:

 let
  Source
    = "<?xml version=""1.0"" encoding=""UTF-8""?>
    <Record>
     <Location>
        <id>123</id>
        <code>abc</code>
     </Location>
     <Location>
        <id>123</id> 
        <code></code>
     </Location>
    <Location>
        <id>123</id>
        <code>124</code>     
     </Location>
</Record>",
  t0 = Xml.Document(Source),
  Value = t0{0}[Value],
  #"Expanded Value"
    = Table.ExpandTableColumn(
    Value,
    "Value",
    {
      "Name",
      "Namespace",
      "Value",
      "Attributes"
    },
    {
      "Name.1",
      "Namespace.1",
      "Value.1",
      "Attributes.1"
    }
  ),
  #"Removed Other Columns"
    = Table.SelectColumns(
    #"Expanded Value",
    {"Name.1", "Value.1"}
  ),
  #"Added Custom" = Table.AddColumn(
    #"Removed Other Columns",
    "Custom",
    each
      if Value.Is([Value.1], type table) then
        null
      else
        [Value.1]
  )
in
  #"Added Custom"

[Thanks to David Browne for the examples and solutions, and to Curt Hagenlocher for confirming this is the way the Xml.Tables and Xml.Document functions are meant to behave]

Understanding Power Query Query Diagnostics Data With The Diagnostics.ActivityID M Function

I was looking at the output of Power Query’s Query Diagnostics feature recently (again) and trying to understand it better. One of the more confusing aspects of it is the way that the Power Query engine may evaluate a query more than once during a single refresh. This is documented in the note halfway down this page, which says:

Power Query might perform evaluations that you may not have directly triggered. Some of these evaluations are performed in order to retrieve metadata so we can best optimize our queries or to provide a better user experience (such as retrieving the list of distinct values within a column that are displayed in the Filter Rows experience). Others might be related to how a connector handles parallel evaluations.

I came up with the following M query to illustrate this:

#table(
    type table 
    [#"Activity ID"=text], 
    {{Diagnostics.ActivityId()}}
    )

If you paste this code into a new blank query:

…you have a query that returns a table containing a single cell containing the text value returned by the Diagnostics.ActivityId M function, which I blogged about here. The output – copied from the Data pane of the main Power BI window – looks like this:

The Diagnostics.ActivityId function is interesting because it returns an identifier for the currently-running query evaluation, so in the table above the value in the Activity ID column is the identifier for the query that returned that table.

If you run a Query Diagnostics trace when refreshing this query, you’ll see that the Activity Id column of the Diagnostics_Detailed trace query contains evaluation identifier values:

The following query takes the output of a Diagnostics_Detailed trace query and gets just the unique values from the Id and Activity Id columns:

let
    Source = #"Diagnostics_Detailed_2022-04-24_19:40",
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Id", "Activity Id"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Other Columns")
in
    #"Removed Duplicates"

This makes it easy to see that my query was actually evaluated (or at least partially evaluated) three times when I clicked refresh. Since the value in the Activity Id column for Id 4.10 matches the value in the table loaded into my dataset, I know that that was the evaluation that loaded my table into the dataset.

Filtering An Excel Cube Function Report By A List Of Manually-Entered Values

In Power BI there’s a popular custom visual called “Filter by list” that lets you filter a Power BI report by any list of values that you paste into it. It can save you a lot of time in some scenarios, for example if you need to copy a list of values from another application and select those values in a slicer. In this post I’ll show how to recreate the same functionality in an Excel report connected to Power BI, Analysis Services or the Excel Data Model/Power Pivot using cube functions and dynamic arrays.

To show how I’m going to use a super-simple model built using Power Pivot consisting of the following single table:

The only other thing to note about the model is that it contains a measure called Sales Amount that sums up the values in the Sales column:

Sales Amount:=SUM(Sales[Sales])

Here’s what a PivotTable connected to this model looks like:

The aim here is to recreate this PivotTable using cube functions and allow the user to enter the list of invoice numbers used to slice the data either manually or by copy-and-pasting them into a table.

The first step is to create an Excel table (which I’ve called InvoiceNumbers) to hold the invoice numbers the user enters:

The next thing to do is to generate the text of the MDX set expression representing the list of invoice numbers in this table, which I’ve put in cell D2:

="{" & TEXTJOIN(",",TRUE, "[Sales].[Invoice Number].[Invoice Number].&[" & InvoiceNumbers & "]" ) &"}"

This text is used to create two named sets using the CUBESET function. The first, which I’ve put in cell D3, simply returns the set of invoice numbers that you get from evaluating the above MDX expression:

=CUBESET("ThisWorkbookDataModel", $D$2, "Invoice Numbers")

The second named set, in D4, is more complicated: it returns the set of customers that have sales for these invoice numbers.

=CUBESET(
"ThisWorkbookDataModel", 
"NONEMPTY( [Sales].[Customer].[Customer].MEMBERS, {[Measures].[Sales Amount]} * " & $D$2 & ")",
"Customers")

Last of all are the cube functions that display the report itself. In cell E6 is the CUBEVALUE function returning the measure Sales Amount:

=CUBEMEMBER("ThisWorkbookDataModel", "[Measures].[Sales Amount]")

In D7 is the formula (using the technique I blogged about here) to get the list of Customers returned by the second named set above:

=MAKEARRAY(
CUBESETCOUNT($D$4), 
1, 
LAMBDA(r,c, CUBERANKEDMEMBER("ThisWorkbookDataModel", $D$4, r))
)

Finally, in D8, is the expression that gets the Sales Amount values for each customer, sliced also by the set of selected invoice numbers:

=MAKEARRAY(
CUBESETCOUNT($D$4),
1,
LAMBDA(r,c,
CUBEVALUE("ThisWorkbookDataModel", INDEX($D$7#,r), $D$3, $E$6))
)

Here are the formulas all together:

And here it all is working:

One last point: to keep things simple I’ve not included any error handling, which means that if a user enters a blank value or a value that isn’t an invoice number in the table the whole thing will break. To handle errors using the technique I blogged about here, alter the formula in D2 to:

="{"&
TEXTJOIN(
",",
TRUE,
LET(
MemberExpression,
"[Sales].[Invoice Number].[Invoice Number].&["&InvoiceNumbers&"]",
"IIF(ISERROR(STRTOMEMBER("""&MemberExpression&""")), {}, STRTOMEMBER("""&MemberExpression&"""))")
) &"}"

You can download the example workbook here (although it may not work unless you’ve got a version of Excel with dynamic arrays enabled).

New Book: “Microsoft Power BI Performance Best Practices” By Bhavik Merchant

I’ve said it before but I’ll say it again: I don’t publish book reviews here on my blog but I’m always happy to promote new Power BI books when they are published in return for a free copy.

Recently a friend and ex-Microsoft colleague of mine, Bhavik Merchant, published a book called “Microsoft Power BI Performance Best Practices” which I wrote the foreword for and I think (although of course I’m biased) it’s a good one. It’s about tuning all aspects of Power BI report and refresh performance, including DAX, data modelling, gateway configuration, Power Query/M and report design; it also covers the use of external tools like DAX Studio and Tabular Editor. From a purely technical point of view it gathers together a lot of useful information that is otherwise scattered across various documentation articles, blog posts, conference presentations and white papers; what I particularly liked, though, is the emphasis on methodology and how you should think about approaching performance tuning. If you’re new to Power BI this is a great resource but even experienced Power BI developers and consultants will learn something from it.

You can buy the book from Amazon UK here.

Forecasting With Power BI Data In Excel Using Cube Functions And The FORECAST.ETS Function, Part 2

In the first post in this series I showed a simple example of how you can use the FORECAST.ETS function in Excel in combination with the Excel cube functions to do forecasting with Power BI data. In this post I’ll show you how you can:

  1. Make the range of data that you display from Power BI, and pass into the FORECAST.ETS function, dynamic and controllable from a slicer
  2. Make the number of periods that you forecast for dynamic too
  3. Put both the actuals and forecast data together in a single range and display that in a chart

The first problem, making the range of data returned from Power BI via cube functions dynamic, is reasonably straightforward because it’s a variation on a technique I blogged about last year here. A slicer can be used to select the date range, which in turn can be captured using the CUBESET function, and finally the MAKEARRAY function can be used to return a dynamic array of dates and associated measure values. Here it is working:

Cell B2 contains the CUBESET formula that is used to capture the set of selected items in the slicer (which is called Slicer_Date):

=CUBESET("Price Paid", Slicer_Date, "Date Range Set")

B5 contains the dynamic array formula that returns the dates selected in the slicer using the CUBERANKEDMEMBER function:

=MAKEARRAY(
CUBESETCOUNT($B$2), 
1, 
LAMBDA(r,c,CUBERANKEDMEMBER("Price Paid",$B$2,r))
)

C5 contains the dynamic array formula that returns the values for the Count of Sales measure for the date range in B5:

=MAKEARRAY(
CUBESETCOUNT($B$2),
1,
LAMBDA(r,c,
LET(
CubeValueResult,
CUBEVALUE("Price Paid",$C$4,INDEX($B$5#,r)),
IF(ISNUMBER(CubeValueResult), CubeValueResult,0))
))

The second problem is how to create a similar dynamic range of forecast dates and values. Here’s the solution working:

J3 contains the number of days to forecast. F5 contains a formula that returns a list of dates whose length is controlled by the value in J3, and which starts the day after the last day in the range returned by the formula in B5. Here’s the formula in F5:

=SEQUENCE($J$2)+MAX(DATEVALUE($B$5#))

The formula in G5 returns the forecast values for the date range returned by F5, based on the values returned by the formulas in B5 and C5:

=INT(FORECAST.ETS($F$5#, VALUE($C$5#), DATEVALUE($B$5#),7))

The third and final problem is how to combine these two ranges into a single range, like so:

The key to appending the Forecast values underneath the Count Of Sales values is the new VSTACK Excel function. So, for example, in I5 the following formula returns a dynamic array combining the dates used by the two ranges created above:

=VSTACK($B$5#, $F$5#)

For the Count Of Sales and Forecast columns I have padded the data out with zeroes, so for example the Count Of Sales column shows zeroes for the dates that contain forecast values and the Forecast column contains zeroes for the dates that contain Count Of Sales data. I did this by using VSTACK and appending/pre-pending an array containing zeroes created using MAKEARRAY. Here’s the formula for J5, ie the data in the Count Of Sales column:

=VSTACK($C$5#, MAKEARRAY($J$2, 1,LAMBDA(r,c,0)))

Here’s the formula for K5, ie the data in the Forecast column:

=VSTACK(MAKEARRAY(CUBESETCOUNT($B$2), 1,LAMBDA(r,c,0)), $G$5#)

I could have used the HSTACK function to combine these three dynamic arrays into a single array but there’s no real benefit to doing this, and not doing it makes it easy to use the technique Jon Peltier describes here to display dynamic arrays in a chart. I won’t repeat what he says but you need to create Names for these last three dynamic arrays in order to be able to use them in a chart.

One last thing: I haven’t said anything about how to make sure the forecast values are useful and accurate. That’s because I’m not a data scientist and I don’t have any good advice to share. This is a very important topic, though, and I’m very grateful to Sandeep Pawar for providing some tips on Twitter here.

Forecasting With Power BI Data In Excel Using Cube Functions And The FORECAST.ETS Function, Part 1

One of the reasons I love using the Excel cube functions to get Power BI data into Excel instead of PivotTables is because they make it much easier to use other native Excel features and functions in combination with your data. One Excel feature I’ve always been curious about is the FORECAST.ETS function, which allows you to forecast values from time series data. Here’s what the docs have to say about this function:

[This function] calculates or predicts a future value based on existing (historical) values by using the AAA version of the Exponential Smoothing (ETS) algorithm. The predicted value is a continuation of the historical values in the specified target date, which should be a continuation of the timeline. You can use this function to predict future sales, inventory requirements, or consumer trends.

Recently I started playing around with this function to see how it could be used with cube functions and since I learned so many interesting things I thought it would make a good series of blog posts. In this post I’ll look at a basic scenario and in my next post I’ll show how to use dynamic arrays (including one of the newly-released functions) to build something more complex. For these posts I’ll use a dataset containing Land Registry Price Paid data as my source; the exact contents don’t matter much but it contains a Date dimension table, a fact table containing one row for each property (or “real estate” for you Americans) sale in England and Wales in 2018 and a measure called Count Of Sales that returns the aggregated number of property sales.

Let’s start with a worksheet containing cube functions that show the Count of Sales measure broken down by date for the first seven days of 2018:

Here are the underlying formulas:

How can we use FORECAST.ETS on this data? Well, to keep things super-simple I manually entered the next three dates (ie the 8th, 9th and 10th of January) in cells B10, B11 and B12 like so:

The question is, how can we add forecast values to the range C10:C12? The first thing I tried was adding the following formula to C10:

=FORECAST.ETS($B10, $C$3:$C$9, $B$3:$B$9)

It didn’t work though…

The problem here is that the dates don’t come through from Power BI as Excel dates and the values don’t come through from Power BI as numeric values. This can be fixed by using the VALUE and DATEVALUE Excel functions like so:

=FORECAST.ETS($B10, VALUE($C$3:$C$9), DATEVALUE($B$3:$B$9))

Which can then be dragged down to C11 and C12 too:

We now have something working but there are still a few more problems to solve:

  • The forecasts come back as decimal values, which don’t make sense for this measure. This can be fixed by the use of the INT function, for example, to round the values to integers.
  • From looking at the data I can see there is a clear weekly pattern in the sales – very few properties are sold at weekends, whereas Fridays are the most popular day to sell – and this can be added as a hint to the function by setting the Seasonality argument of FORECAST.ETS to 7.
  • Depending on how you slice the data, some dates may return a blank value for Count Of Sales and if there are too many then FORECAST.ETS may not be able to return a value and return a #VALUE error instead. I had hoped setting the Data Completion argument of FORECAST.ETS to 1 would fix this but it didn’t; one solution is to alter the formulas in C3:C9 to return 0 when CUBEVALUE returns a blank value. The standard way to do this is to use ISNUMBER function as detailed on this thread, but I realised this was a great opportunity to use the new(ish) LET function to do things more elegantly.

Here’s the final version of the forecast formula in C10 with these changes:

=INT(FORECAST.ETS($B10, VALUE($C$3:$C$9), DATEVALUE($B$3:$B$9),7))

Here’s the new version of the formula in C3 as an example of this (with a newly added slicer reference too):

=LET(
CubeValueResult,
CUBEVALUE("Price Paid",$B3,C$2, Slicer_Property_Transactions1),
IF(ISNUMBER(CubeValueResult), CubeValueResult, 0)
)

That’s the simple scenario working then. But wouldn’t it be cool if we could vary the number of days of source data or the number of days forecast and put everything together into one big dynamic array so it can be charted easily? I’ll show you how in my next post!

%d bloggers like this: