One of the reasons I love using the Excel cube functions to get Power BI data into Excel instead of PivotTables is because they make it much easier to use other native Excel features and functions in combination with your data. One Excel feature I’ve always been curious about is the FORECAST.ETS function, which allows you to forecast values from time series data. Here’s what the docs have to say about this function:
[This function] calculates or predicts a future value based on existing (historical) values by using the AAA version of the Exponential Smoothing (ETS) algorithm. The predicted value is a continuation of the historical values in the specified target date, which should be a continuation of the timeline. You can use this function to predict future sales, inventory requirements, or consumer trends.
Recently I started playing around with this function to see how it could be used with cube functions and since I learned so many interesting things I thought it would make a good series of blog posts. In this post I’ll look at a basic scenario and in my next post I’ll show how to use dynamic arrays (including one of the newly-released functions) to build something more complex. For these posts I’ll use a dataset containing Land Registry Price Paid data as my source; the exact contents don’t matter much but it contains a Date dimension table, a fact table containing one row for each property (or “real estate” for you Americans) sale in England and Wales in 2018 and a measure called Count Of Sales that returns the aggregated number of property sales.
Let’s start with a worksheet containing cube functions that show the Count of Sales measure broken down by date for the first seven days of 2018:
Here are the underlying formulas:
How can we use FORECAST.ETS on this data? Well, to keep things super-simple I manually entered the next three dates (ie the 8th, 9th and 10th of January) in cells B10, B11 and B12 like so:
The question is, how can we add forecast values to the range C10:C12? The first thing I tried was adding the following formula to C10:
=FORECAST.ETS($B10, $C$3:$C$9, $B$3:$B$9)
It didn’t work though…
The problem here is that the dates don’t come through from Power BI as Excel dates and the values don’t come through from Power BI as numeric values. This can be fixed by using the VALUE and DATEVALUE Excel functions like so:
=FORECAST.ETS($B10, VALUE($C$3:$C$9), DATEVALUE($B$3:$B$9))
Which can then be dragged down to C11 and C12 too:
We now have something working but there are still a few more problems to solve:
- The forecasts come back as decimal values, which don’t make sense for this measure. This can be fixed by the use of the INT function, for example, to round the values to integers.
- From looking at the data I can see there is a clear weekly pattern in the sales – very few properties are sold at weekends, whereas Fridays are the most popular day to sell – and this can be added as a hint to the function by setting the Seasonality argument of FORECAST.ETS to 7.
- Depending on how you slice the data, some dates may return a blank value for Count Of Sales and if there are too many then FORECAST.ETS may not be able to return a value and return a #VALUE error instead. I had hoped setting the Data Completion argument of FORECAST.ETS to 1 would fix this but it didn’t; one solution is to alter the formulas in C3:C9 to return 0 when CUBEVALUE returns a blank value. The standard way to do this is to use ISNUMBER function as detailed on this thread, but I realised this was a great opportunity to use the new(ish) LET function to do things more elegantly.
Here’s the final version of the forecast formula in C10 with these changes:
=INT(FORECAST.ETS($B10, VALUE($C$3:$C$9), DATEVALUE($B$3:$B$9),7))
Here’s the new version of the formula in C3 as an example of this (with a newly added slicer reference too):
=LET( CubeValueResult, CUBEVALUE("Price Paid",$B3,C$2, Slicer_Property_Transactions1), IF(ISNUMBER(CubeValueResult), CubeValueResult, 0) )
That’s the simple scenario working then. But wouldn’t it be cool if we could vary the number of days of source data or the number of days forecast and put everything together into one big dynamic array so it can be charted easily? I’ll show you how in my next post!