Two New Books: “The Definitive Guide To DAX” And “’M’ Is For Data Monkey”

I’m not going to pretend that this blog post is a properly impartial review – I know the authors of both of these books to varying degrees – but I thought it was worth writing a few words on two new books I’ve acquired recently which are worth additions to any Power BI enthusiast’s bookshelf or e-reader.

The Definitive Guide To DAX

Something I’ll never understand about my friends Marco Russo and Alberto Russo is their love of writing books – they generally have a new one out every year, sometimes two (personally I find writing books painful). Their latest publication is “The Definitive Guide To DAX” and it does indeed live up to its title. No-one outside the dev team comes close to Marco and Alberto’s knowledge of DAX, the language of Power Pivot, Power BI Desktop modelling and SSAS Tabular, and in this book they have documented everything that they know about it down to the smallest detail. Want to know what the KeepFilters() function does? Or the GenerateAll() function? How about all the new DAX functions and features in the latest versions of Power BI Desktop which will also appear in SSAS 2016 Tabular? They’re all here, and more. As such this is essential purchase for anyone doing serious work on the Microsoft BI platform, although probably more as a reference than a book to read end-to-end. It’s fair to say there’s a certain amount of overlap between this and some of their previous books on Power Pivot and SSAS Tabular, but the language – and the community’s understanding of it – has evolved sufficiently to justify buying this book too.

[I received a free copy of this book for review]

Buy it here from Amazon UK | US

‘M’ Is For Data Monkey

As the author of the only other book on Power Query, I suppose I should really be keeping quiet about “’M’ Is For Data Monkey” in case you buy it instead of mine. However 18 months of UI changes and functionality improvements mean my book is now a bit out-of-date, and what’s more important is that Ken Puls and Miguel Escobar have had the advantage of a lot of real-world experience with Power Query that I didn’t have (indeed no-one had) when I was writing in early 2014. The book itself is not a formal introduction to the M language but a guide to what you can do with it in Power Query; while a lot of what’s here will be useful in Power BI this is definitely a Power Query book and the target audience is Excel Pros rather than BI Pros. The decision to focus on Excel Pros was a good one to make, in my opinion, because it plays to the authors’ strengths and means that the book has a very practical focus. A lot of the tips and tricks here are ones I’ve used successfully myself, and I don’t mind admitting that I learned one or two things from this book as well.

Buy it here from Amazon UK | US

Other Books Are Available…

There are a couple of other new books out that, although I haven’t seen them, will also be worth checking out. Rob Collie has just released Power Pivot and Power BI, essentially the second edition of DAX Formulas For Power Pivot; Matt Allington has just released Learn To Write DAX; both are going to be good choices for Excel users wanting a DAX tutorial. Finally, last week Teo Lachev announced on his blog that he has published the world’s first dedicated Power BI book. Teo is another author whose books I admire so I’m sure it will be excellent, although I’ll be interested to see how he handles the problem of writing about a product that changes so much so quickly.

Power BI Desktop, Sort By Column And DAX Calculations That Use The All() Function

Recently I came across a problem where a DAX measure gave different results in Excel and Power BI Desktop. It turned out not to be a bug or even a feature, but since it confused me for a few minutes I thought it was worth writing up in case anyone else ran into it.

Consider a model built in Excel using Power Pivot from the following two tables:

image

image

With two measures defined in the model as follows:

Sum of Sales:=SUM(Sales[Sales])

Share:=DIVIDE([Sum of Sales], CALCULATE([Sum of Sales], ALL(Month[Month Name])))

…and, importantly, the Sort By Column property on the Month Name column set to Month Number:

image

…it’s possible to build a PivotTable that looks like this:

image

However, when you import the same model into Power BI Desktop and recreate the PivotTable above in the Report view you’ll see that the Share calculation no longer gives the same values:

image

What’s the problem here? It’s all down to the way Power BI Desktop generates DAX queries when you set the Sort By Column property. The Excel PivotTable above generates the following MDX:

SELECT 
{[Measures].[Sum of Sales],[Measures].[Share]} 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,MEMBER_VALUE,
HIERARCHY_UNIQUE_NAME ON COLUMNS , 
NON EMPTY 
Hierarchize(
{DrilldownLevel({[Month].[Month Name].[All]},,,INCLUDE_CALC_MEMBERS)}) 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,MEMBER_VALUE,
HIERARCHY_UNIQUE_NAME ON ROWS  
FROM [Model] 
CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, 
BACK_COLOR, FORE_COLOR, FONT_FLAGS

On the rows axis, as you would expect, the only hierarchy you see is Month Name.

However, if you run a Profiler trace (you can find out how to do this here although it’s much easier to get the Process ID from DAX Studio) to look at the DAX query generated by Power BI you’ll see

EVALUATE
TOPN (
    102,
    SUMMARIZECOLUMNS (
        ROLLUPADDISSUBTOTAL (
            ROLLUPGROUP ( 'Month'[Month Name], 'Month'[Month Number] ), 
            "IsGrandTotalRowTotal"
        ),
        "Share", 'Sales'[Share],
        "Sum_of_Sales", 'Sales'[Sum of Sales]
    ),
    [IsGrandTotalRowTotal], 0,
    'Month'[Month Number], 1,
    'Month'[Month Name], 1
)
ORDER BY
    [IsGrandTotalRowTotal] DESC,
    'Month'[Month Number],
    'Month'[Month Name]

The difference here is that the Month Number and Month Name fields are both present – they have to be since the query has to sort by Month Number. In MDX the order of members on a hierarchy can be set inside the model; in a DAX query you can only sort using an ORDER BY clause and for that to work, the field you’re ordering by must be present in the query.

The Share measure calculation needs to be changed in order to fix this, then. Here’s one way of doing this:

Share =

DIVIDE([Sum of Sales],

CALCULATE([Sum of Sales], ALL(Month[Month Name], ‘Month'[Month Number])))

Working With Compression In Power Query And Power BI Desktop

If you’re reading this post there’s one important question you’ll probably want to ask: is it possible to extract data from a zip file in Power Query/Power BI? The answer is, unfortunately, no (at least at the time of writing). As this answer from Tristan on the dev team explains, because there are so many flavours of zip file out there it’s an extremely difficult problem to solve – so it hasn’t been attempted yet. That said, there are two other mildly interesting things to learn about compression in Power Query/Power BI Desktop that I thought were worth blogging about…

The first is that Power Query/Power BI can work with gzip files. For example, given a gzip file that contains a single csv file, here’s an example M query showing how the Binary.Decompress() function can be used to extract the csv file from the gzip file and then treat the contents of the csv file as a table:

let
    Source = Binary.Decompress(
     File.Contents(
      "C:\Myfolder\CompressedData.csv.gz"),
       Compression.GZip),
    #"Imported CSV" = Csv.Document(Source,
     [Delimiter=",",Encoding=1252]),
    #"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV"),
    #"Changed Type" = Table.TransformColumnTypes(
     #"Promoted Headers",{
      {"Month", type text}, {" Sales", Int64.Type}
      })
in
    #"Changed Type"

The other is that you’ll see Binary.Decompress() used when you import an Excel workbook that contains a linked table into Power BI Desktop. For example, consider an Excel workbook that contains the following table:

image

If this table is imported into the Excel Data Model as a linked table, and you then save the workbook and try to import it into Power BI using File/Import/Excel Workbook Contents:

image

… you’ll see this message:

image

Click Start and you’ll get another message:

image

If you choose the Copy Data option, the data from the Excel table will be copied into Power BI. But where is it stored exactly? A look in the Query Editor at the query that returns the data shows that it’s embedded in the M code itself:

let
    Source = Table.FromRows(
     Json.Document(
      Binary.Decompress(
       Binary.FromText(
        "i45WMlTSUTJSitWJVjIGskzALFMgy0wpNhYA", 
        BinaryEncoding.Base64), 
       Compression.Deflate)), 
      {"A","B"}),
    #"Changed Type" = Table.TransformColumnTypes(
     Source,{{"A", Int64.Type}, {"B", Int64.Type}})
in
    #"Changed Type"

That big chunk of text in the middle of the Source step is the data from the Excel table stored as a compressed JSON document, and again Binary.Decompress() is used to extract this data.

The Text.Format() Function In Power BI/Power Query M

New functions are appearing in M all the time, many with no announcement or documentation. I came across Text.Format() the other day and I thought it was worth blogging about because I can see a lot of uses for it: it makes it very easy to insert values into a piece of text.

The function signature is:
Text.Format(formatString as text, arguments as any, optional culture as nullable text)

Here’s a simple example:

Text.Format("The #[Animal] sat on the #[Furniture]", [Animal="cat", Furniture="mat"])

It returns the text:

The cat sat on the mat

image

As you can see, the references to each record field in the first piece of text are replaced with the values from those fields from the record in the second parameter. Those of you who know a little M will realise how this works: the placeholder in the text passed to the first parameter is actually the same M expression you would use to extract the value you need from the record in code. So [Animal] is the M expression you’d use to return the value from the Animal field from the record [Animal="cat", Furniture="mat"], as in following expression which returns the text value “cat”:

let
    MyRecord = [Animal="cat", Furniture="mat"],
    GetAnimal = MyRecord[Animal]
in
    GetAnimal

The second parameter can take other data types too. You can pass a list instead of a record; so for example the expression

Text.Format(

"The first number is #{0}, the second number is #{1}, the third number is #{2}",

{5,8,9})

returns the text

The first number is 5, the second number is 8, the third number is 9

The optional third parameter of Text.Format() controls the locale/culture used when formatting the values. So for example the expression

Text.Format(

"Decimal example #{0} – Date example #{1}", {100.001, #date(2015,12,1)},

"en-us") 

returns a decimal number and date formatted for US English, with a full stop (or period, as the Americans say) as the decimal separator and the date shown as mm/dd/yyyy:

Decimal example 100.001 – Date example 12/1/2015

While the expression

Text.Format(

"Decimal example #{0} – Date example #{1}", {100.001, #date(2015,12,1)},

"fr-fr") 


Returns the same values formatted for French/France, where the decimal separator is a comma and dates are formatted dd/mm/yyy:

Decimal example 100,001 – Date example 01/12/2015

How about some more advanced examples? Here’s a table in Excel:

image

If you load it into Power Query and then create a custom column, in your custom column expression you can refer to the current row as a record using the _ (underscore) character. So creating a custom column using the following expression:

Text.Format("the #[Animal] sat on the #[Furniture]", _)

image

Returns a table that looks like this:

image

You could also use Text.Format() to create parameterised queries to run against a database. Here’s an example of an MDX query on the Adventure Works DW database with a parameterised WHERE clause:

let
    MDXQueryTemplate = 
	"SELECT#(lf)
	{[Measures].[Internet Sales Amount]} ON 0,#(lf)
	NON EMPTY#(lf)
	[Date].[Calendar Year].[Calendar Year].MEMBERS ON 1#(lf)
	FROM#(lf)
	[Adventure Works]#(lf)
	WHERE(
	STRTOMEMBER(
	""[Customer].[Country].&[#[Country]]"", 
	CONSTRAINED))",
    ReplaceCountryParameter = 
	Text.Format(
		MDXQueryTemplate, 
		[Country="Australia"]),
    RunQuery = 
	AnalysisServices.Database(
		"localhost", 
		"adventure works dw 2008", 
		[Query=ReplaceCountryParameter])
in
    RunQuery

Remember, if you do something like this you’ll probably want to disable native database prompts – if you don’t, you’ll be asked to approve every new query that gets run. Also, you’ll notice that I’m using the StrToMember() function with the Constrained flag in the WHERE clause because, even though it’s not really necessary, it’s good from a security point of view. It would be really good if we could use proper MDX parameters in our queries but I don’t think it’s possible, unless there’s some other new feature or function that I don’t know about.