The Pros And Cons Of Modelling Measures As A Dimension In Power BI

You probably know that dimensional modelling and building a star schema are very important in Power BI. In a star schema you’ll have at least one fact table, and in your fact table you’ll have two main types of column: dimension keys, which link the fact table to dimension tables, and measures, which store the data you want to aggregate and analyse. Here’s a simple example of a fact table called Sales with three dimension key columns (Product, Store and Customer) and three measure columns (Sales, Tax and Volume Sold):

Quite often, though, I see people taking a slightly different approach to modelling their fact tables: instead of having separate columns for each measure they unpivot their data, create one row in their fact table for each measure value, use a single column to store all the measure values and create a new dimension to allow the user to select which measure values they want. Here’s an example of another fact table, called Sales Unpivot, showing how the data from the Sales fact table above can be remodelled using this technique:

In this fact table the dimension keys remain the same, but the Value column stores all the data from the Sales, Tax and Volume Sold measures in the original table and the Measure Name column tells you what type of measure value is stored on any given row. Let’s call this approach the Measures Dimension approach.

There are some advantages to building fact tables using the Measures Dimension approach, for example:

  • You can now use a slicer in a report to select the measures that appear in a visual
  • You can now easily add new measures without having to add new columns in your fact table
  • You can use row-level security to control which measures a user has access to

Generally speaking, though, any time you deviate from a conventional dimensional model you risk running into problems later on and this is no exception. Let’s go through the disadvantages of modelling data using a Measures Dimension.

Formatting

Notice that the Sales and Tax measure columns from the Sales fact table are currency values and that Volumn Sold contains decimal values with four decimal places. It’s quite easy to set different formats for different measures when each measure is a separate column:

However, when all your values are stored in one column, as in the Measures Dimension example, formatting is not so straightforward. You might be able to get away with using one generic format for all your data:

…but that isn’t ideal. Of course you can create DAX measures and format them appropriately but then you lose some of the flexibility of this approach; you could also use a calculation group and dynamic format strings as Kasper describes here.

Compression

More seriously, Power BI does a much more efficient job of storing and compressing the data in a conventional fact table compared to when the Measures Dimension approach is used and this has consequences for query performance. Using the View Metrics button in DAX Studio to see the details of how the data is stored for each table is revealing. Here are some selected highlights:

First of all, notice that the Sales Unpivot table (which uses the Measures dimension approach) is 66% larger than the Sales table. Notice also that in the Sales table the Sales and Tax measure columns, which contain currency values, can use the Currency data type (which shows up Decimal here, confusingly) which in turn means that they can use Value encoding; only the Volume Sold column needs to be stored using the Decimal Number data type (which shows up as Double here), and must use Hash encoding. In the Sales Unpivot table, since all the measure values are stored in the Value column, this column has to use the Decimal Number data type and Hash encoding. As this article explains (the Definitive Guide To DAX goes into a lot more detail) Value encoding can give you a lot of performance benefits.

Calculation Complexity

When you start to build more complex DAX calculations then the disadvantages of the Measures Dimension approach become even more apparent. Let’s say you want a visual in your report that shows Sales, Tax and a measure that subtracts Tax from Sales called Sales After Tax:

Here’s the DAX needed for this visual:

Sales Measure = SUM('Sales'[Sales])
Tax Measure = SUM('Sales'[Tax])
Sales After Tax = [Sales Measure] - [Tax Measure]

To achieve the same result with the Measures Dimension approach, though, you need to know how to use the DAX Calculate() function, something like this:

Sales Measure 2 =
CALCULATE (
    SUM ( 'Sales Unpivot'[Value] ),
    KEEPFILTERS ( 'Sales Unpivot'[Measure Name] = "Sales" )
)
Tax Measure 2 =
CALCULATE (
    SUM ( 'Sales Unpivot'[Value] ),
    KEEPFILTERS ( 'Sales Unpivot'[Measure Name] = "Tax" )
)
Sales After Tax 2 = [Sales Measure 2] - [Tax Measure 2]

[Note that in most cases I’d create a separate dimension table for the Measures dimension, but to keep things simple here I’ve not done that]

If you expect other people to build measures on your dataset then this additional complexity can be a significant barrier to overcome. Calculate isn’t an easy function to use properly.

Calculation Performance

Last of all, there’s also also a performance penalty to pay with the Measures dimension. Taking the Sales After Tax example from the previous section, here’s what the Server Timings tab in DAX Studio shows for the query associated with the visual showing Sales, Tax and Sales After Tax:

Notice that there’s just one Storage Engine query: DAX fusion has kicked in so that the Sales and Tax values required can be retrieved in the same scan.

However, here’s what the Server Timings tab shows for the same visual using the Measures Dimension approach and the second set of measures using the Sales Unpivot table shown above:

Not only is this query slower but there are now two Storage Engine queries: one to get the Sales data and one to get the Tax data. Since separate scans are needed to get each measure value, the more measures you have in a visual or the more measures needed by your calculations, the more scans are needed. This can very quickly add up to a big performance problem, especially if each scan is relatively slow – which is more likely to be the case since the Measures Dimension approach means Power BI is less able to compress data effectively.

Conclusion

As you’ve probably guessed by now I’m not a big fan of the Measures Dimension approach. While there are definitely some advantages to using it I think the disadvantages – which aren’t always immediately obvious – outweigh them.

25 responses

  1. Very interesting Chris. I will need to explore this more. Out of interest, why can’t the measure dim value column be formatted as currency from PQ?

  2. Ahh, I just re-looked and notice your volume has a large number of decimals, too many for currency. I guess this was to illustrate a point. But I can also see that volume is sometimes (often even) whole numbers and/or less than 5 decimals, so in that case presumably currency would be fine.

    • Yes, if all your measures are Currency then the data type isn’t a problem, but you still might have an issue with the ranges of values within each measure.

  3. Good article. I think what is missing though is the business requirements. It is the business requirements that generally lead you down one data model path or another. Sure, it is fair to say that there are disadvantages and complexities with Measure Dimensions but this article fails to demonstrate how a conventional data model can meet the stated business requirements of why someone might want to consider Measure Dimensions or the complexity that would be introduced by trying to meet those requirements with a more traditional data model. At the end of the day, this takes too much of a stance that “I don’t care what the business wants, I’m doing it my way” attitude and, quite honestly, it was because of this kind of attitude that self service BI exists in the first place.

    • Obviously, I’m going to respectfully disagree with you here. While I understand that there are lots of requirements that might lead you down this path, my point is that using this approach can stop you satisfying other requirements – such as the often unarticulated, but still incredibly important, requirement that the report runs quickly. There are always trade-offs between different requirements to be made and you should understand what they are before you model your data. I’ve seen far too many cases where people have used some clever modelling to meet a requirement (see also https://www.kasperonbi.com/dynamically-switching-axis-on-visuals-with-power-bi/) and not realised how much of a performance penalty there is to pay. Besides that, it’s also easy for both business users and report developers to confuse functional requirements (“I must be able to switch the measures shown in this visual using a slicer”) with business requirements (“I need to understand how these factors affect my profitability”). I would love it if Power BI could meet all these requirements without any compromises being necessary and hopefully one day it will be able to.

      • I guess what I was saying is that the article would be more complete if, for example, the subject of RLS by measure comes up. In a standard data model you could fulfill this requirement but you would need to create a separate table for these column level security columns and create a unique key between them (related). Then you can implement RLS on this table that is separate from the main table. This adds complexity to the model obviously and there are potential issues with double data load (can get around that using query references), etc. So, only pointing out the complexity on the one side and not the complexity on the other side makes things very one-sided and not a fair comparison between the approaches.

      • I do actually call out RLS by measure in the post as an advantage of the Measures Dimension approach, although admittedly I don’t go into any detail on it. This is because I’ve had to deal with a lot of cases over the last few months where customers had not known about the drawbacks of using a Measures Dimension until it was too late, and I want to make sure people only use this approach if they really have to.

  4. The problem comes when you have not just Sales + Tax, but Sales revenue, COGS, Gross profit, etc all the other rows in the ‘Income statement’ for each Main-product/service and Then comes the Volume and all other things from the ‘Operational Statement’.
    It’s good to mention it as pros and cons since – to my knowledge – there is no International Standard – how to deal with this … or it would be in Oracle’s Dimensional model (HFM-cube) – that we use.

  5. Pingback: The Downside of EAV-Style Measures in Power BI – Curated SQL

  6. Totally agree Chris and thanks for sharing. It’s so easy to think this would be smart if someone haven’t told us not to do it 🙂
    And the business requirements could easily be achieved by instead using a decoupled dimension listing the measures and a switch function that chooses the wanted columns. Preferably the columns should anyway be “hidden” to the user and instead have explicit measures.

  7. Would “Modelling Measures As A Dimension” be a solution if you had 50 or more different reports (different tables in database), with different headers.
    Then one could unpivot every table and append them all into one big table, and use Modelling Measures As A Dimension?

  8. @Chris – Not Sure Why KEEPFILTERS is required.

    Can we not write
    mSales = CALCULATE([mValue], Sales[Attribute]=”Sales”)
    Where mValue = Sum(Sales[Value])

    Cheers
    Sam

  9. Matt – Great article.

    It touched on something I am dealing with now, and was wondering what my best approach might be. I was wondering if you gave any thought to separate tables, rather than a “Dimension Value” Column.

    In our case, we have text files as source data, not the best solution, but it’s what we have… and there are 7 key dimensions (Quantity, Labor, Equipment, Material, Subcontracts, Supplies, Contract) and in each dimension, there are multiple measures that need to be created (Original Budget, Current Budget, Actual Hours, Actual Cost, Earned Hours, Earned Cost, Forecast Hours, Forecast Cost, and a multitude of “variance” measures. BTW… Only the “Labor” dimension has hours.)

    My thought was a fact table for each of the 7 key dimensions all connected to dimension tables, so (hopefully) it would be transparent to the user community.

    Thoughts?

  10. I recently joined a team that uses a PBI model to hold what is, in effect, a data warehouse (lots of disparate data and indicators about a common set of organisations). They’ve taken a “Measures Dimension” approach – they then use CALCULATE functions to pick out specific data items to make measures (which I think get’s them around your “formatting” con?). They’ve put a lot of work into their model, but it was one by someone with a background in databases rather than PBI and I think their attitude was “let’s just load all the data and sort all the problems out with DAX”. I agree this is a recipe for confusion and a pretty poor platform on which to do any decent analytics.

    Your post will be much help in persuading them of their mistake (I hope!). I’d suggest adding a further ‘con’: producing visualisations will be really tricky e.g. doing an x/y scatter – as you’d need to filter for one measure on one axis and the other measure on the other one. I’d go as far as to say their approach is only really viable if they only ever want to look at one variable at a time?

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: