The Pros And Cons Of Modelling Measures As A Dimension In Power BI

You probably know that dimensional modelling and building a star schema are very important in Power BI. In a star schema you’ll have at least one fact table, and in your fact table you’ll have two main types of column: dimension keys, which link the fact table to dimension tables, and measures, which store the data you want to aggregate and analyse. Here’s a simple example of a fact table called Sales with three dimension key columns (Product, Store and Customer) and three measure columns (Sales, Tax and Volume Sold):

Quite often, though, I see people taking a slightly different approach to modelling their fact tables: instead of having separate columns for each measure they unpivot their data, create one row in their fact table for each measure value, use a single column to store all the measure values and create a new dimension to allow the user to select which measure values they want. Here’s an example of another fact table, called Sales Unpivot, showing how the data from the Sales fact table above can be remodelled using this technique:

In this fact table the dimension keys remain the same, but the Value column stores all the data from the Sales, Tax and Volume Sold measures in the original table and the Measure Name column tells you what type of measure value is stored on any given row. Let’s call this approach the Measures Dimension approach.

There are some advantages to building fact tables using the Measures Dimension approach, for example:

  • You can now use a slicer in a report to select the measures that appear in a visual
  • You can now easily add new measures without having to add new columns in your fact table
  • You can use row-level security to control which measures a user has access to

Generally speaking, though, any time you deviate from a conventional dimensional model you risk running into problems later on and this is no exception. Let’s go through the disadvantages of modelling data using a Measures Dimension.

Formatting

Notice that the Sales and Tax measure columns from the Sales fact table are currency values and that Volumn Sold contains decimal values with four decimal places. It’s quite easy to set different formats for different measures when each measure is a separate column:

However, when all your values are stored in one column, as in the Measures Dimension example, formatting is not so straightforward. You might be able to get away with using one generic format for all your data:

…but that isn’t ideal. Of course you can create DAX measures and format them appropriately but then you lose some of the flexibility of this approach; you could also use a calculation group and dynamic format strings as Kasper describes here.

Compression

More seriously, Power BI does a much more efficient job of storing and compressing the data in a conventional fact table compared to when the Measures Dimension approach is used and this has consequences for query performance. Using the View Metrics button in DAX Studio to see the details of how the data is stored for each table is revealing. Here are some selected highlights:

First of all, notice that the Sales Unpivot table (which uses the Measures dimension approach) is 66% larger than the Sales table. Notice also that in the Sales table the Sales and Tax measure columns, which contain currency values, can use the Currency data type (which shows up Decimal here, confusingly) which in turn means that they can use Value encoding; only the Volume Sold column needs to be stored using the Decimal Number data type (which shows up as Double here), and must use Hash encoding. In the Sales Unpivot table, since all the measure values are stored in the Value column, this column has to use the Decimal Number data type and Hash encoding. As this article explains (the Definitive Guide To DAX goes into a lot more detail) Value encoding can give you a lot of performance benefits.

Calculation Complexity

When you start to build more complex DAX calculations then the disadvantages of the Measures Dimension approach become even more apparent. Let’s say you want a visual in your report that shows Sales, Tax and a measure that subtracts Tax from Sales called Sales After Tax:

Here’s the DAX needed for this visual:

Sales Measure = SUM('Sales'[Sales])
Tax Measure = SUM('Sales'[Tax])
Sales After Tax = [Sales Measure] - [Tax Measure]

To achieve the same result with the Measures Dimension approach, though, you need to know how to use the DAX Calculate() function, something like this:

Sales Measure 2 =
CALCULATE (
    SUM ( 'Sales Unpivot'[Value] ),
    KEEPFILTERS ( 'Sales Unpivot'[Measure Name] = "Sales" )
)
Tax Measure 2 =
CALCULATE (
    SUM ( 'Sales Unpivot'[Value] ),
    KEEPFILTERS ( 'Sales Unpivot'[Measure Name] = "Tax" )
)
Sales After Tax 2 = [Sales Measure 2] - [Tax Measure 2]

[Note that in most cases I’d create a separate dimension table for the Measures dimension, but to keep things simple here I’ve not done that]

If you expect other people to build measures on your dataset then this additional complexity can be a significant barrier to overcome. Calculate isn’t an easy function to use properly.

Calculation Performance

Last of all, there’s also also a performance penalty to pay with the Measures dimension. Taking the Sales After Tax example from the previous section, here’s what the Server Timings tab in DAX Studio shows for the query associated with the visual showing Sales, Tax and Sales After Tax:

Notice that there’s just one Storage Engine query: DAX fusion has kicked in so that the Sales and Tax values required can be retrieved in the same scan.

However, here’s what the Server Timings tab shows for the same visual using the Measures Dimension approach and the second set of measures using the Sales Unpivot table shown above:

Not only is this query slower but there are now two Storage Engine queries: one to get the Sales data and one to get the Tax data. Since separate scans are needed to get each measure value, the more measures you have in a visual or the more measures needed by your calculations, the more scans are needed. This can very quickly add up to a big performance problem, especially if each scan is relatively slow – which is more likely to be the case since the Measures Dimension approach means Power BI is less able to compress data effectively.

Conclusion

As you’ve probably guessed by now I’m not a big fan of the Measures Dimension approach. While there are definitely some advantages to using it I think the disadvantages – which aren’t always immediately obvious – outweigh them.

Dynamically Generated Lines On A Map In Power BI Using DAX, WKT And The Icon Map Custom Visual

I don’t generally blog about data visualisation in Power BI because it’s not my strong point, and I don’t blog about maps and geospatial analysis because I know even less about that subject and there are people like David Eldersveld who cover it so well. I do like playing around with maps though and recently I’ve been having fun with the Icon Map custom visual developed by James Dales. Probably the thing I like most about it is that so many properties accept measures as inputs, which means that you can use DAX to do some interesting things.

In this post I’ll show you a very basic example of how to display dynamically generated lines on a map – specifically, I’ll show you how to draw lines between any four cities that a user selects from a slicer. Pre-calculating all the possible combinations of cities when you load your data is not feasible because of the sheer number, so this is a scenario where being able to dynamically generate the lines between the four selected cities in DAX is really useful.

To start off, I downloaded a table with a list of cities in the UK and their latitudes and longitudes from https://simplemaps.com/data/gb-cities and then loaded this data into four identical tables in Power BI called Start, Second Stop, Third Stop and End.

I then created four slicers from the City column from these four tables, so the user can select the four different cities they want to draw lines between.

The key to making this work is to use the Icon Map visual’s ability to display geometric data in Well Known Text format; documentation on this can be found here. To achieve this I wrote some (slightly over-engineered) DAX in a measure that generates a LINESTRING object that has four lines to connect the four selected cities:

Route = 
var CityTable = 
{
    (SELECTEDVALUE('Start'[Longitude]), 
    SELECTEDVALUE('Start'[Latitude])),
    (SELECTEDVALUE('Second Stop'[Longitude]), 
    SELECTEDVALUE('Second Stop'[Latitude])),
    (SELECTEDVALUE('Third Stop'[Longitude]), 
    SELECTEDVALUE('Third Stop'[Latitude])),
    (SELECTEDVALUE('End'[Longitude]), 
    SELECTEDVALUE('End'[Latitude]))
}
var FilteredCityTable = 
FILTER(
    CityTable, 
    NOT(ISBLANK([Value1])) && 
    NOT(ISBLANK([Value2])))
return
"LINESTRING ( " & 
CONCATENATEX(
    FilteredCityTable, 
    [Value1] & " " & 
    [Value2], ", ") & 
")"

Here’s what the measure returns when the cities of Bath, Birmingham, Banbridge and Bradford are selected:

LINESTRING ( -2.365556 51.379444, -1.916667 52.466667, -6.26701 54.35091, -1.75 53.783333)

Finally, I dragged this measure in the Icon URL / WKT / SVG well in the Icon Map visual (NB: you need to use the version of the visual from http://www.icon-map.com/ and not the version from AppSource for now for this to work). Here’s the result:

You can download the sample pbix file here.

There are far more practical things you can do with this: for example, Mimoune Djouallah has a great blog post here on how to use Well Known Text in Icon Map to display 1.2 million points on a map. Let me know if you have a real-world application for this by leaving a comment.

Make Your Power BI Report Run Faster By Showing The Same Data In Fewer Visuals

Do you have a Power BI report that has a row of card visuals on it, something like this:

Cards

?

It’s a very common thing to do, but if your report is too slow to render you may find that a design like this is part of the problem. Now look at the following report which contains a single matrix that has been configured to look as un-matrix-like as possible but shows the same data:

Matrix

This is likely to perform faster, but why?

Before we go any further, I don’t want you to go and change your reports if you’re not going to get any benefit from doing so. Use Performance Analyzer (as shown here) to determine which visuals on your report are the cause of slow performance – there’s no point redesigning visuals that are fast anyway.

As a general rule the more visuals you put on a report page the slower it’s going to get. It’s logical if you think about it: the more visuals there are, the more queries have to be run against your dataset and the more work Power BI has to do to render the report. I know there is a tendency to try to pack as much information onto a page as possible and this often happens when someone else has designed the report you’re trying to build, but you should always try to resist this. Splitting a single large page into multiple smaller pages, using slicers or filters to reduce the amount of data shown at any one time and avoiding gigantic Excel-like tables are a good idea.

Once you’ve done that you need to see whether you can reduce the number of visuals needed to display the same amount of information: in the example above, the four separate cards show the same data as the single matrix. In part this is because the cards generate four separate DAX queries against the dataset to get the data they need whereas the matrix only generates one DAX query. There is a certain overhead to running a DAX query, so reducing the number of DAX queries needed to get the same amount of data is a good thing. More importantly, in this example Power BI can get the four values required for the single DAX query generated by the matrix much more efficiently than it can in the four separate DAX queries needed by the cards. This is because of something called “DAX Fusion”, which my colleague Phil Seamark blogged about in great detail here, and this can make a signficant difference to performance (Marco Russo’s post on why Analyze In Excel reports may be slower than Power BI reports here also contains some useful information on DAX Fusion).

This is only a simple example; if you want to see a really sophisticated demonstration of replacing several visuals with a matrix I suggest you watch this video by another colleague of mine, Miguel Myers. And of course it doesn’t just apply to cards: any time you can replace multiple visuals with a single visual (remember to be careful using custom visuals though) you’re likely to gain some performance.

The “Visual Has Exceeded The Available Resources” Error In Power BI

Very occasionally you may find that a report that is slow to render – but still renders successfully – in Power BI Desktop shows the error “Visual has exceeded the available resources” for some visuals when published to the Power BI Service:

Main Error

Clicking on “See details” will show a dialog that looks something like this:

SecondError

This visual has exceeded the available resources. Try filtering to decrease the amount of data displayed.

Please try again later or contact support. If you contact support, please provide these details.More details Resource Governing: The query exceeded the maximum memory allowed for queries executed in the current workload group (Requested 1048580KB, Limit 1048576KB).

 

The official Power BI documentation has similar advice to what’s shown in this dialog about what to do here, but what’s really going on?

The information in the “More details” section of the section dialog gives you a clue: in this case it’s resource governance. When you run a DAX query in Power BI it will always use a certain amount of memory; inefficient DAX calculations can cause a query to try to grab a lot of memory. In Power BI Desktop these queries may run successfully but be slow, but the Power BI Service can’t just let a query use as many resources as it wants (if it did, it may affect the performance of other queries being run by other users) so there is a resource governor that will kill queries that are too resource hungry. In the case of the visual above the query behind it tried to use more than 1GB of memory and was killed by the resource governor.

The limits enforced by the resource governor cannot be changed in Power BI shared capacity. The limits are higher in a Premium capacity and vary depending on the capacity size, and if you are a Premium Capacity Admin there are a number of different settings on a capacity you can change that will affect this, described here. For example the Query Memory Limit (%) setting controls the amount of memory that any given query can use before the resource governor kicks in.

To demonstrate this I moved the workspace containing the report in the screenshot above to a Power BI Embedded A4 capacity and then changed the Query Memory Limit % setting to 1.

QueryMemoryLimit

This resulted in the error dialog showing that the resource governor killed the query when it tried to use 1% of the 25GB of RAM available in an A4, which is 262144KB:

LowerLimit

While it may be useful to change this setting if you’re the admin of an overloaded Premium capacity, the real solution is to tune the DAX calculations used in this visual (I suggest you read the section on materialisation starting on p568 of “The Definitive Guide to DAX” or this older article which covers the same ground) so they use less memory and return faster or change what the visual displays.

You may see the “Visual has exceeded the available resources” error in other scenarios where the resource governor is not involved. For example, setting the Query Timeout setting on the capacity to 5 seconds like so:

TimeoutSetting

…means that the visual now times out after five seconds instead, and the “See details” dialog shows the following error instead:

Timeout

More details The XML for Analysis request timed out before it was completed. Timeout value: 5 sec.

 

Power BI Report Builder And RSCustomDaxFilter

If you’re building DAX queries using Power BI Report Builder you might notice something that looks like a new DAX function called RSCustomDaxFilter. For example, here’s a simple DAX query built from the Adventure Works Tabular demo database, with one measure and one parameter built Calendar Year:

image

Note that the “Enable Multi Value Parameters” option has been selected. Here’s what you’ll see if you view the text of the DAX query in the Query Designer:

EVALUATE 
SUMMARIZECOLUMNS(
RSCustomDaxFilter(
 @DateCalendarYear,
 EqualToCondition,
 [Date].[Calendar Year],
 Int64
 ), 
"Internet Total Sales", 
[Internet Total Sales]
)

But what is RSCustomDaxFilter? If you run your report and see what happens on the server using SQL Server Profiler, here’s the query that actually gets run (in this case I selected the years 2013 and 2014 for the parameter):

EVALUATE
SUMMARIZECOLUMNS (
    FILTER (
        VALUES ( 'Date'[Calendar Year] ),
        ( 'Date'[Calendar Year] = VALUE ( "2013" ) )
            || ( 'Date'[Calendar Year] = VALUE ( "2014" ) )
    ),
    "Internet Total Sales", [Internet Total Sales]
)

What has happened is that RSCustomDaxFilter has been replaced with an expression using the DAX Filter() function that implements the filter on the selected years; it’s just a placeholder for a dynamically-generated DAX expression that is substituted in at runtime.

Why is it needed? Handling multi-value parameters is difficult in DAX when you don’t know how many values are going to be passed to the parameters (it’s a subject I’ve blogged about here and here) and some kind of dynamic code generation is a reasonable solution to this problem.

There is one drawback with this approach though – it can generate a DAX query that is too long to be executed. Here’s the error message you’ll see:

image

The specified query is too complex to be evaluated as a single statement.

I’m not sure what the maximum length of a query is in DAX – I suspect it’s 32768 characters. You’ll hit this limit if you create a parameter on a field with thousands of values in and then select all but a few of these values; from the example above you can imagine how long the resulting DAX query could be:

image

There’s no single workaround for this but some custom DAX (such as the example here) will be necessary; you’ll just need to ensure that the DAX query is as short as possible.

Power BI And Case Sensitivity

Case sensitivity is one of the more confusing aspects of Power BI: while the Power Query engine is case sensitive, the main Power BI engine (that means datasets, relationships, DAX etc) is case insensitive. Imke Feldmann shows how to make Power Query case insensitive through custom M here; in this blog post I’m going to concentrate on what case insensitivity means in datasets and DAX and show a way to (kind of) work around it.

Starting with a simple example, let’s say you have a data source that contains data that looks like this:

image

This is what the data looks like in the Power Query Editor which is, as I said, case sensitive – which means that it sees “A” and “a” as different characters, so you see six distinct characters in each row of the table. This could be data from any data source; to make things easy I’ve used the following M query to generate this table:

let
    Source = 
    #table(
        type table 
        [MyTextColumn = text], 
        {
            {"A"},
            {"a"},
            {"B"},
            {"b"},
            {"C"},
            {"c"}
        })
in
    Source

When this data gets loaded into a Power BI dataset, however, you’ll see the following in the Data pane of the main Power BI window:

image

Because the main Power BI engine is case insensitive – so for example “a” and “A” are considered as the same character – when this data is loaded in, it only sees three distinct characters and you can’t be sure whether you’ll get the lower case or upper case character stored twice. This is just the way it works; while an instance of Analysis Services Tabular, which is basically the same engine that is found in Power BI, can be either case sensitive or case insensitive, this option isn’t available in Power BI. There is an idea to support case sensitivity as an option in Power BI here and while there are some valid scenarios where you need this, in my experience case sensitivity causes a lot of problems so I can see why it wasn’t a priority.

What’s more you’ll find that DAX calculations are case insensitive too. If you try the following DAX expression to create a calculated table:

Demo Table = 
DATATABLE(
    "MyTextColumn",
    STRING,
    {
        {"A"},
        {"a"}
    }
)

You’ll get this:

image

DAX measures behave in a similar way. Given the following table:

image

If you use the following measure in a table visual:

Demo Measure = 
IF(
    SELECTEDVALUE('Demo Table 2'[MyIntColumn])=1,
    "A",
    "a"
)

…you’ll see:

image

The only way you can work around this case insensitivity is to make text values that would otherwise look the same to the Power BI engine somehow different. One way of doing this would be to add some extra characters to your text. You might think adding some extra spaces would be the way to go; revisiting the first M query shown above, you could add a space to every lower case character in the table like so:

let
    Source = 
    #table(
        type table 
        [MyTextColumn = text], 
        {
            {"A"},
            {"a "},
            {"B"},
            {"b "},
            {"c "},
            {"C"}
        })
in
    Source

But this doesn’t work because another little-known behaviour of the Power BI engine is that all leading and trailing spaces are trimmed when text is loaded. Anyway, spaces may not be visible but they still take up… well space. A better option – and one that actually works – is to use the Unicode Zero-Width Space character and the M Character.FromNumber function instead like so:

let
    Source = 
    #table(
        type table 
        [MyTextColumn = text], 
        {
            {"A"},
            {"a" & Character.FromNumber(8203)},
            {"B"},
            {"b" & Character.FromNumber(8203)},
            {"c" & Character.FromNumber(8203)},
            {"C"}
        })
in
    Source

The great thing about this character is that although it’s there, it’s invisible and takes no space. When you load this last M query into Power BI you see the following table:

image

Let’s finish off with a more advanced example of how to use this rather excellent character. Say you have the following source data returned by an M query called SourceData:

image

Of course, when this table is loaded into Power BI, you see the following:

image

BUT, the following M query takes this table and for each row goes through the text in the OriginalText column and adds a zero-width space after each lower case character:

let
    Source = 
    SourceData,
    ToList = 
    Table.AddColumn(
        Source, 
        "Chars", 
        each Text.ToList([OriginalText])
        ),
    LowerCaseChars = 
    {"a".."z"},
    AddInvisibleChars = 
    Table.AddColumn(
        ToList, 
        "AddInvisibleChars", 
        each 
        List.Transform(
            [Chars], 
            each 
            if 
            List.Contains(LowerCaseChars, _) 
            then 
            _ & Character.FromNumber(8203) 
            else _
            )
            ),
    RecombineList = 
    Table.AddColumn(
        AddInvisibleChars, 
        "OutputText", 
        each 
        Text.Combine([AddInvisibleChars]), 
        type text
        ),
    RemovedOtherColumns = 
    Table.SelectColumns(
        RecombineList,
        {"OutputText"}
        )

in
    RemovedOtherColumns

When you load the output of this query into Power BI, because the zero-width spaces after each lower case character make Power BI see each piece of text as being different, it looks like case has been preserved even when it hasn’t:

image

You can download the Power BI file for this last example here.

[Thanks to Greg Galloway, Akshai Mirchandani and Jeffrey Wang for their help with this post]

Power BI Custom Format String Examples, Part 2: Dates And Times

In my last post I showed lots of examples of how Power BI’s new custom format string feature can be used to format numbers. This post, looking at dates and times, will be a bit different for two reasons: there are a lot more useful examples of custom date and time formats built into Power BI Desktop, and some of the format placeholders listed in the VBA documentation aren’t supported in Power BI. As a result I’m going to concentrate on some useful formats that aren’t covered well by the examples and highlight a few things that aren’t possible right now.

Here’s the table containing the sample data, in the form of date/time values, for my examples:

image

[Note that the dates shown above are formatted in dd/mm/yyyy format]

The reason I’ve used date/time values for my examples is that they can be used to demonstrate formats for values of data type date and data type time, as well as data type date/time. As with my previous post I’m going to create a series of measures to show the effects of different format strings, each with the same DAX definition:

Eg1 = SELECTEDVALUE(Examples[DateAndTime])

With the default format of :

image

…applied, here’s what the output looks like in a Power BI report:

image

Dates

Let’s start by looking at date formats. The first thing to point out is that you can format a date/time so it only shows the date part and not the time and vice versa. For example, applying the custom format string:

dd/mm/yyyy

image

…where dd is day number, mm is month number and yyyy is a four digit year, gives you:

image

If you’re American and want your months to come before your days you can simply swap the dd and the mm, for example with the format string:

mm/dd/yyyy

image

You’re not forced to use a / as your separator in a date; in fact you can use any character. For example, the custom format:

dd*mm*yyyy

…gives you:

image

If you don’t want a leading zero in front of your day or month number you can use a single d or m, and if you want a two-digit year you can use yy instead of yyyy. So, for example:

d/m/yy

…gives you:

image

You can add full day names and month names using dddd and mmmm, so the format:

dddd dd mmmm yyyy

…gives you:

image

You can also get abbreviated day and month names using ddd and mmm, so:

ddd dd mmm yyyy

…gives you:

image

Last of all, for dates, if you want to make it clear that you’re not showing really, really, really old data you can put a g on the end of your date format like so:

dd/mm/yyyy g

…and you get the following:

image

Although the VBA documentation talks about showing day number of week, week number and quarter number, I haven’t found a way of making that work in Power BI (although I may have missed something and it may be possible).

Times

Times are a bit more straightforward. The main placeholders are hh for hours, mm or nn for minutes and ss for seconds. So for example:

hh:mm:ss

and

hh:nn:ss

…both give you times formatted using the 24-hour clock as follows:

image

If you prefer to use the 12-hour clock you can add AM/PM onto the end of your format string, so:

hh:nn:ss AM/PM

…gives you:

image

The VBA documentation has several variants on AM/PM with slightly different outputs, but this is the only format that I could make work in Power BI.

The very last thing to mention is that, at the time of writing, although Power BI can store times with millisecond values there is no way to make milliseconds appear in a formatted time or date/time. For example in all the screenshots above there are two rows displaying 11/11/2019 15:15:15; they appear as different rows in the table because the millisecond values for each are different, but there’s no way of formatting these values to show that they are different. Hopefully this will be rectified soon; in the meantime you will need to store the millisecond part of any time or date/time separately in a different column in your dataset if you want to display it.

You can download the example .pbix file for this post here.

Power BI Custom Format String Examples, Part 1: Numbers

Now that we can apply custom format strings to fields and measures in Power BI in the September 2019 release, I thought it would be useful to provide some examples of what’s possible with this very flexible new feature because the existing documentation for VBA isn’t easy to make sense of. In fact there’s so much to say I’m going to have to write a series of blog posts to cover everything! In this first post I’m going to look at formatting numbers.

First of all, here’s the source data I’m going to use for my examples:

image

I’m going to create a whole series of identical measures defined like this:

SalesEg1 = SUM('ExampleTable'[Sales])

…and apply different custom format strings to each one so you can compare the output in a Power BI table visual. For reference, here’s what a blank custom format gives you with this measure:

image

image

Let’s start with the basics of formatting numeric values. The first thing to point out is that custom format strings are built up using a series of placeholder characters that allow you to control things like thousands separators, the number of decimal places, whether digits are displayed in a placeholder and so on.

Setting the number of decimal places

As you can see in the screenshots above, two of the values have four decimal places but by default only two decimal places are shown. To always show three decimal places, use the following format string:

0.000

image

Here’s the result:

image

In this case the 0 is a placeholder for a digit that must always be displayed and the . is the decimal separator; three 0s after the . means you always get three decimal places for non-blank numeric values.

Optional digits

You may have noticed in the last screenshot that all numbers show three decimal places, even the value for Pears and the Total. If you don’t want the decimal places to appear – or indeed you don’t want a digit to appear in a particular place if it’s a zero – you can use a # character as a placeholder instead. The following format string:

0.###

…always shows a zero before the decimal separator, but will only show the decimal places if they aren’t zeroes:

image

Thousands separators

If you want to display a thousands separator in your numbers you can use a comma placeholder in your format string, like so:

#,0.###

image

Percentages

If you have values that you want to display as percentages, you can use the % placeholder as follows:

#,0.###%

image

Notice that two things have happened here:

  • A percentage sign has been added to the end of each value
  • The values appear to have been multiplied by 100. They actually haven’t, but the percentage format makes them look as though they have been. Any calculations that reference this measure will still get the unmultiplied value as you would expect.

Currency symbols

If you want currency symbols to appear in your format string you can just add them in either before or after the main part of your format string. For example to put a UK pound sign in a format string you can use the following:

£#,0.00

image

Different formats for positive values, negative values and zeroes

If you need to format positive values, negative values and zeroes differently, you can add up to three different sections to your custom format string separated by a semi colon, as follows:

#,0.0;-#,0.000;0

image

In this case notice how the positive values have one decimal place, the negative value has three decimal places and the zero has no decimal places. In Analysis Services Multidimensional it used to be possible to add a fourth section to format blanks/nulls, but that does not seem to work here unfortunately…

Formatting negative values with parentheses

A common requirement in financial reporting is to format negative values with parentheses (round brackets) instead of a minus sign, and that’s possible with custom format strings. For example:

#,0.0;(#,0.000);0

image

Other Text

You can escape individual characters in your format string by preceding them with a \ placeholder. Say you wanted a # to actually appear in your formatted output and not have it considered as a placeholder, you could use the following:

\##,0.00

image

You can also include whole chunks of text by putting it in double quotes, like so:

“Positive”;”Negative”;”Zero”

image

That’s enough for today; tune in for my next post with even more examples!

You can download the sample pbix file for this post here.

The Second Edition Of “The Definitive Guide To DAX” Is Out!

If you’re a Power BI fan there are three possible answers to the question “Did you know that the second edition of The Definitive Guide To DAX has just been published?”:

Answer#1: Yup, I’ve already got my copy!

If this is your answer there’s no need to read any further.

Answer #2: What’s “The Definitive Guide To DAX”?

If, on the other hand, you’re new to Power BI and this is what you’re thinking then I should explain that “The Definitive Guide To DAX” is a book by Marco Russo and Alberto Ferrari and is what its title suggests it is – the sum total of human knowledge about the DAX calculation and query language used by Power BI, written by the two people who know most about it outside the development team. Marco and Alberto are friends of mine but I don’t think anyone can accuse me of bias when I say that it’s a book that every Power BI developer needs to own, so go out and buy it! If you use Power BI you need to learn DAX and while this book may not be a simple step-by-step tutorial it has in it somewhere answers to just about every question you’ll ever ask about DAX – and, more importantly, the answers it has are as correct and as up-to-date as they possibly can be. I can tell you that it’s proved invaluable to me in my work at least twice in the last week alone.

Answer #3: Yes, I saw that but I already have the first edition – is it worth buying this one too?

This is a slightly more difficult question to answer, but I’m still going to recommend that you buy the second edition. As Marco says in his announcement blog post, a lot of the existing content has been updated and rewritten and a lot of new content has been added. If you care about following all the latest DAX best practices and you don’t want the new hire in your department to mock you because you’ve never heard of DAX Studio, you need to buy this new edition.

[Note: I didn’t get a free copy of this book for review (yet?) but I have an O’Reilly Online Learning account which means I could read it as soon as it was published]

PS I know someone needs to write the “Definitive Guide to M” but it’s not going to be me, at least not right now.

DAX Median() Function Does Not Work On Tables With More Than 2 Billion Rows

An interesting – if obscure – fact I learned recently is that a small number of DAX functions such as Median() do not work on tables with more than 2 billion rows in Analysis Services Tabular, Azure AS and Power BI.

It’s quite easy to reproduce in Power BI. The following M expression returns a table with two billion and four rows:

let
    Source = 
    List.Repeat(
        {1,2,3,4},
        500000001
        ),
    #"Converted to Table" = 
    Table.FromList(
        Source, 
        Splitter.SplitByNothing(), 
        null, 
        null, 
        ExtraValues.Error
        ),
    #"Changed Type" = 
    Table.TransformColumnTypes(
        #"Converted to Table",
        {{"Column1", Int64.Type}}
        )
in
    #"Changed Type"

It takes some time to load this table  – around twenty minutes – but because there are only four distinct values in the table the resulting .pbix file is only 31KB thanks to the way Power BI compresses data.

If you load this table into your dataset, call it VeryBigTable and create the following measure:

Median Test = MEDIAN(VeryBigTable[Column1])

…and use the measure in a visual, you’ll see the following error:

image

The current query cannot be evaluated for the ‘VeryBigTable (42)’ table, because the table contains more than two billion rows.

What’s more, the error will always occur even if you apply a filter to the table that returns less than two billion rows. The same problem occurs with some other functions, such as Percentile(), but it’s worth pointing out that the vast majority of DAX functions work as normal with tables with more than two billion rows – for example, in the pbix file used here the Sum() and CountRows() functions not only work fine but return instantly.

Luckily, in the case of the Median() function, there is an easy workaround because you can calculate a median in other ways such as the one described on the DAX Patterns site here. The code is a lot more verbose but it works on a 2 billion+ row table.

image

%d bloggers like this: