DAX Median() Function Does Not Work On Tables With More Than 2 Billion Rows

An interesting – if obscure – fact I learned recently is that a small number of DAX functions such as Median() do not work on tables with more than 2 billion rows in Analysis Services Tabular, Azure AS and Power BI.

It’s quite easy to reproduce in Power BI. The following M expression returns a table with two billion and four rows:

let
    Source = 
    List.Repeat(
        {1,2,3,4},
        500000001
        ),
    #"Converted to Table" = 
    Table.FromList(
        Source, 
        Splitter.SplitByNothing(), 
        null, 
        null, 
        ExtraValues.Error
        ),
    #"Changed Type" = 
    Table.TransformColumnTypes(
        #"Converted to Table",
        {{"Column1", Int64.Type}}
        )
in
    #"Changed Type"

It takes some time to load this table  – around twenty minutes – but because there are only four distinct values in the table the resulting .pbix file is only 31KB thanks to the way Power BI compresses data.

If you load this table into your dataset, call it VeryBigTable and create the following measure:

Median Test = MEDIAN(VeryBigTable[Column1])

…and use the measure in a visual, you’ll see the following error:

image

The current query cannot be evaluated for the ‘VeryBigTable (42)’ table, because the table contains more than two billion rows.

What’s more, the error will always occur even if you apply a filter to the table that returns less than two billion rows. The same problem occurs with some other functions, such as Percentile(), but it’s worth pointing out that the vast majority of DAX functions work as normal with tables with more than two billion rows – for example, in the pbix file used here the Sum() and CountRows() functions not only work fine but return instantly.

Luckily, in the case of the Median() function, there is an easy workaround because you can calculate a median in other ways such as the one described on the DAX Patterns site here. The code is a lot more verbose but it works on a 2 billion+ row table.

image

9 responses

  1. Pingback: DAX Median & 2 Billion+ Rows – Curated SQL

    • You could create a calculated column and then use SUM? I know that a calculated column isn’t going to work in all cases, and even if it does there are massive issues with creating a calculated column like this, but it’s the only workaround I can think of without knowing more about the calculation you’re trying to do.

      • EVALUATE(
        CALCULATETABLE(‘Bets’, ‘Game'[Game Category]=”Cash Games”)
        )

        doesn’t work either, this is starting to be serious problem, next on the list to try is LOOOKUPVALUE, but this is forcingme to change the model structure. will give you feedback when tested.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: