Analysing Power BI DMV Queries In Power BI Desktop

Experienced Analysis Services and Power Pivot developers will know all the interesting things you can learn from DMV queries. For example, in this blog post Kasper de Jonge shows how to use a DMV to find out which columns are using the most memory in a Power Pivot model; here’s an older post from me on using DMVs to show the dependencies between DAX calculations. There’s plenty of other interesting stuff about how data is stored and so on that you can find with a bit of digging, and there’s some reasonably up-to-date documentation on DMVs for SSAS here.

However, running DMV queries against a Power BI Desktop model (which of course runs a local version of the same engine that powers Analysis Services Tabular and Power Pivot), and more importantly doing something useful with the information they return, isn’t straightforward. You can run DMV queries from DAX Studio, but that only gives you the raw table of data returned; you have to copy and paste it into another tool for any further analysis. Instead it’s possible to use Power BI Desktop’s own functionality for connecting to Analysis Services to connect to its own local data model and run DMV queries.

If you’re connecting to an instance of Analysis Services in Power BI Desktop you need a server name and a database name and the same goes when you’re connecting Power BI Desktop to itself. This episode of Adam Saxton’s excellent Guy In A Cube YouTube show details how to connect SQL Server Profiler to Power BI Desktop in order to run a trace, and the method he uses to find the connection details also works for our purpose here; it’s definitely worth a watch because it goes into a lot of detail. However it’s much easier to get the connection details from DAX Studio using the following steps (thanks to Darren Gosbell for pointing this out):

  1. Open Power BI Desktop and load data as normal.
  2. Open DAX Studio and choose the Power BI Designer data source option:

    image

  3. Look in the bottom right-hand corner of the screen and you’ll see the server name, including the all-important port number, that you have connected to. In this case it’s
    localhost:52981

    image

  4. Run the following DMV query in a DAX query window. This will give you the nasty GUID that is the name of the only database in the Power BI data model:
    SELECT [CATALOG_NAME] FROM $SYSTEM.DBSCHEMA_CATALOGS
    

    image

  5. Now go back to Power BI Desktop, click the Get Data button and choose Analysis Services. In the connection dialog enter the server name and port number, the database name found in the previous steps, and your DMV query:

    image

Once you have done this, you can load the results of your DMV query into the data model and analyse the results just like any other data. Here’s an M query using the DMV from the blog post by Kasper referenced earlier that looks at the memory used by just one table in the model:

let
    Source = AnalysisServices.Database(
     "localhost:52981", 
     "1107a5df-3dd9-4c16-80b6-cf571c4e794f", 
     [Query="SELECT dimension_name, attribute_name, DataType,
                   (dictionary_size/1024) AS dictionary_size 
                   FROM $system.DISCOVER_STORAGE_TABLE_COLUMNS"]),
    #"Filtered Rows" = Table.SelectRows(Source, 
     each ([dimension_name] = "FactInternetSales"))
in
    #"Filtered Rows"

And here’s a bar chart built from that query showing clearly that the SalesOrderNumber column in the FactInternetSales table is very expensive:

image
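Since the DMV results load like any other table, you can also aggregate them in M before visualising. Here's a sketch of a variation on the query above that sums dictionary sizes to get the total per table (the port number and database GUID are the ones from my machine and will be different on yours):

```m
let
    Source = AnalysisServices.Database(
     "localhost:52981",
     "1107a5df-3dd9-4c16-80b6-cf571c4e794f",
     [Query="SELECT dimension_name, attribute_name,
                   (dictionary_size/1024) AS dictionary_size
                   FROM $system.DISCOVER_STORAGE_TABLE_COLUMNS"]),
    // Sum the dictionary sizes of all the columns in each table
    GroupedByTable = Table.Group(Source, {"dimension_name"},
     {{"total_dictionary_size", each List.Sum([dictionary_size]), type number}}),
    // Most expensive tables first
    Sorted = Table.Sort(GroupedByTable,
     {{"total_dictionary_size", Order.Descending}})
in
    Sorted
```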

Before you get too excited about this, there are two major problems you need to be aware of when using this technique:

  1. The port number used to connect to the local data model and the database name will change every time you open the Power BI Desktop file, so you will have to edit the connection information manually if you want to refresh the data after reopening.
  2. You won’t be able to make this connection refresh once you have published the file to PowerBI.com – so this will only work on the desktop.

That said, I think this is still very useful for development purposes. At some point I expect we’ll probably get a range of management reports in PowerBI.com that show similar data on published reports – just like we used to have in the old Office 365 Power BI.

The Table.Schema() Function In Power BI/M

Yet another new M function for you this week: the Table.Schema() function, which returns information about the columns in a table. There’s some fairly detailed documentation about what it returns here; a simple demo is always a lot more helpful though, I think.

If you connect to the Adventure Works DW database in SQL Server and import the DimDate table, you’ll get an M query that returns the contents of that table (along with some extra columns that describe the relationships between that table and the others in the database):

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data]
in
    dbo_DimDate

image

If you add an extra step to this query that calls the Table.Schema() function on the table returned by the dbo_DimDate step, like so:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    GetSchema = Table.Schema(dbo_DimDate)
in
    GetSchema 

…you get a table with one row for each column in the table returned by dbo_DimDate, and a lot of columns that give you information on each column such as its position in the table, its M data type, its data type in the original data source and so on:

image 

I can think of a lot of uses for this. Documenting a database is an obvious one; it would also serve as a much richer source of data when checking for changes in the structure of a data source, as I described here. Also, given how easy it is to tell whether two tables contain the same data in M, you could use this function to compare whether two tables have the same columns like so:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    SomeOtherTable = 
     Source{[Schema="dbo",Item="SomeOtherTable"]}[Data],
    TablesEquivalent = 
     (Table.Schema(dbo_DimDate)=Table.Schema(SomeOtherTable))
in
    TablesEquivalent

If you want more detail when doing comparisons you can do that with a little bit more M code, but that’s probably a topic for another post.
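In the meantime, here's a minimal sketch of what a more detailed comparison might look like (the table names are assumed): a full outer join of the two schema tables on column name, keeping only the rows where a column is missing from one side or has a different type:

```m
let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    SomeOtherTable = Source{[Schema="dbo",Item="SomeOtherTable"]}[Data],
    // Keep just the column name and type name from each schema table
    SchemaA = Table.SelectColumns(Table.Schema(dbo_DimDate),
     {"Name", "TypeName"}),
    SchemaB = Table.RenameColumns(
     Table.SelectColumns(Table.Schema(SomeOtherTable), {"Name", "TypeName"}),
     {{"Name", "Name2"}, {"TypeName", "TypeName2"}}),
    // A full outer join keeps columns that appear in only one of the tables
    Joined = Table.Join(SchemaA, "Name", SchemaB, "Name2",
     JoinKind.FullOuter),
    // Rows with a null on either side, or differing types, are the differences
    Differences = Table.SelectRows(Joined,
     each [Name] = null or [Name2] = null or [TypeName] <> [TypeName2])
in
    Differences
```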

Descriptive Statistics In Power BI/M With Table.Profile()

As Buck Woody notes here, when you are exploring a new data set it can be useful to calculate some basic descriptive statistics. One new M function that appeared in Power BI recently can help you to do this: Table.Profile(). This function takes a value of type table and returns a table that displays, for each column in the original table, the minimum, maximum, average, standard deviation, count of values, count of null values and count of distinct values (but no mode or median?). So, given the following table:

…the Table.Profile() function returns the following table:

Of course you could create something similar yourself fairly easily (as I have done for a customer in the past), and it’s not as sophisticated as the Quick Insights feature, but it’s handy to have a single function that does all this.
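To try it out without connecting to anything, here's a self-contained sketch that builds a small table in M (rather than the table from the screenshot above) and profiles it:

```m
let
    // A small sample table built in M, including a null to show NullCount
    Source = #table(
     type table [Product = text, Sales = number],
     {{"Apples", 5}, {"Oranges", 10}, {"Pears", null}}),
    // One row per column of Source, with min, max, average etc.
    Profile = Table.Profile(Source)
in
    Profile
```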

You could even use it on all of the tables in a SQL Server database. The Sql.Database() function returns a table containing all of the tables and views in a database, like so:

image

All you need to do to use Table.Profile() on all these tables is to add a new custom column that calls this function for every value in the Data column:

image

Then finally expand the new custom column and you’ll see the stats for every column in every table:

image

Here’s the code:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    #"Added Custom" = Table.AddColumn(Source, "Profile", 
      each Table.Profile([Data])),
    #"Expanded Profile" = Table.ExpandTableColumn(#"Added Custom", 
      "Profile", 
      {"Column", "Min", "Max", "Average", "StandardDeviation", "Count", "NullCount", "DistinctCount"}, 
      {"Column", "Min", "Max", "Average", "StandardDeviation", "Count", "NullCount", "DistinctCount"})
in
    #"Expanded Profile"

Two New Books: “The Definitive Guide To DAX” And “’M’ Is For Data Monkey”

I’m not going to pretend that this blog post is a properly impartial review – I know the authors of both of these books to varying degrees – but I thought it was worth writing a few words on two new books I’ve acquired recently which are worthy additions to any Power BI enthusiast’s bookshelf or e-reader.

The Definitive Guide To DAX

Something I’ll never understand about my friends Marco Russo and Alberto Ferrari is their love of writing books – they generally have a new one out every year, sometimes two (personally I find writing books painful). Their latest publication is “The Definitive Guide To DAX” and it does indeed live up to its title. No-one outside the dev team comes close to Marco and Alberto’s knowledge of DAX, the language of Power Pivot, Power BI Desktop modelling and SSAS Tabular, and in this book they have documented everything that they know about it down to the smallest detail. Want to know what the KeepFilters() function does? Or the GenerateAll() function? How about all the new DAX functions and features in the latest versions of Power BI Desktop which will also appear in SSAS 2016 Tabular? They’re all here, and more. As such this is an essential purchase for anyone doing serious work on the Microsoft BI platform, although probably more as a reference than a book to read end-to-end. It’s fair to say there’s a certain amount of overlap between this and some of their previous books on Power Pivot and SSAS Tabular, but the language – and the community’s understanding of it – has evolved sufficiently to justify buying this book too.

[I received a free copy of this book for review]

Buy it here from Amazon UK | US

‘M’ Is For Data Monkey

As the author of the only other book on Power Query, I suppose I should really be keeping quiet about “’M’ Is For Data Monkey” in case you buy it instead of mine. However 18 months of UI changes and functionality improvements mean my book is now a bit out-of-date, and what’s more important is that Ken Puls and Miguel Escobar have had the advantage of a lot of real-world experience with Power Query that I didn’t have (indeed no-one had) when I was writing in early 2014. The book itself is not a formal introduction to the M language but a guide to what you can do with it in Power Query; while a lot of what’s here will be useful in Power BI this is definitely a Power Query book and the target audience is Excel Pros rather than BI Pros. The decision to focus on Excel Pros was a good one to make, in my opinion, because it plays to the authors’ strengths and means that the book has a very practical focus. A lot of the tips and tricks here are ones I’ve used successfully myself, and I don’t mind admitting that I learned one or two things from this book as well.

Buy it here from Amazon UK | US

Other Books Are Available…

There are a couple of other new books out that, although I haven’t seen them, will also be worth checking out. Rob Collie has just released Power Pivot and Power BI, essentially the second edition of DAX Formulas For Power Pivot; Matt Allington has just released Learn To Write DAX; both are going to be good choices for Excel users wanting a DAX tutorial. Finally, last week Teo Lachev announced on his blog that he has published the world’s first dedicated Power BI book. Teo is another author whose books I admire so I’m sure it will be excellent, although I’ll be interested to see how he handles the problem of writing about a product that changes so much so quickly.

Power BI Desktop, Sort By Column And DAX Calculations That Use The All() Function

Recently I came across a problem where a DAX measure gave different results in Excel and Power BI Desktop. It turned out not to be a bug or even a feature, but since it confused me for a few minutes I thought it was worth writing up in case anyone else ran into it.

Consider a model built in Excel using Power Pivot from the following two tables:

image

image

With two measures defined in the model as follows:

Sum of Sales:=SUM(Sales[Sales])

Share:=DIVIDE([Sum of Sales], CALCULATE([Sum of Sales], ALL(Month[Month Name])))

…and, importantly, the Sort By Column property on the Month Name column set to Month Number:

image

…it’s possible to build a PivotTable that looks like this:

image

However, when you import the same model into Power BI Desktop and recreate the PivotTable above in the Report view you’ll see that the Share calculation no longer gives the same values:

image

What’s the problem here? It’s all down to the way Power BI Desktop generates DAX queries when you set the Sort By Column property. The Excel PivotTable above generates the following MDX:

SELECT 
{[Measures].[Sum of Sales],[Measures].[Share]} 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,MEMBER_VALUE,
HIERARCHY_UNIQUE_NAME ON COLUMNS , 
NON EMPTY 
Hierarchize(
{DrilldownLevel({[Month].[Month Name].[All]},,,INCLUDE_CALC_MEMBERS)}) 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,MEMBER_VALUE,
HIERARCHY_UNIQUE_NAME ON ROWS  
FROM [Model] 
CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, 
BACK_COLOR, FORE_COLOR, FONT_FLAGS

On the rows axis, as you would expect, the only hierarchy you see is Month Name.

However, if you run a Profiler trace (you can find out how to do this here, although it’s much easier to get the Process ID from DAX Studio) to look at the DAX query generated by Power BI, you’ll see:

EVALUATE
TOPN (
    102,
    SUMMARIZECOLUMNS (
        ROLLUPADDISSUBTOTAL (
            ROLLUPGROUP ( 'Month'[Month Name], 'Month'[Month Number] ), 
            "IsGrandTotalRowTotal"
        ),
        "Share", 'Sales'[Share],
        "Sum_of_Sales", 'Sales'[Sum of Sales]
    ),
    [IsGrandTotalRowTotal], 0,
    'Month'[Month Number], 1,
    'Month'[Month Name], 1
)
ORDER BY
    [IsGrandTotalRowTotal] DESC,
    'Month'[Month Number],
    'Month'[Month Name]

The difference here is that the Month Number and Month Name fields are both present – they have to be since the query has to sort by Month Number. In MDX the order of members on a hierarchy can be set inside the model; in a DAX query you can only sort using an ORDER BY clause and for that to work, the field you’re ordering by must be present in the query.

The Share measure calculation needs to be changed in order to fix this, then. Here’s one way of doing this:

Share =
DIVIDE (
    [Sum of Sales],
    CALCULATE ( [Sum of Sales], ALL ( 'Month'[Month Name], 'Month'[Month Number] ) )
)

The Text.Format() Function In Power BI/Power Query M

New functions are appearing in M all the time, many with no announcement or documentation. I came across Text.Format() the other day and I thought it was worth blogging about because I can see a lot of uses for it: it makes it very easy to insert values into a piece of text.

The function signature is:
Text.Format(formatString as text, arguments as any, optional culture as nullable text)

Here’s a simple example:

Text.Format("The #[Animal] sat on the #[Furniture]", [Animal="cat", Furniture="mat"])

It returns the text:

The cat sat on the mat

image

As you can see, the references to each record field in the first piece of text are replaced with the values from those fields from the record in the second parameter. Those of you who know a little M will realise how this works: the placeholder in the text passed to the first parameter is actually the same M expression you would use to extract the value you need from the record in code. So [Animal] is the M expression you’d use to return the value from the Animal field from the record [Animal="cat", Furniture="mat"], as in following expression which returns the text value “cat”:

let
    MyRecord = [Animal="cat", Furniture="mat"],
    GetAnimal = MyRecord[Animal]
in
    GetAnimal

The second parameter can take other data types too. You can pass a list instead of a record; so for example the expression

Text.Format(
    "The first number is #{0}, the second number is #{1}, the third number is #{2}",
    {5, 8, 9})

returns the text

The first number is 5, the second number is 8, the third number is 9
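As with records, the #{0} placeholder mirrors the M expression you'd use for positional access into a list. The following expression shows the equivalent direct lookup, returning the first item of the list:

```m
let
    MyList = {5, 8, 9},
    // Curly braces after a list value perform zero-based positional access
    GetFirst = MyList{0}
in
    GetFirst
```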

The optional third parameter of Text.Format() controls the locale/culture used when formatting the values. So for example the expression

Text.Format(
    "Decimal example #{0} – Date example #{1}",
    {100.001, #date(2015, 12, 1)},
    "en-us")

returns a decimal number and date formatted for US English, with a full stop (or period, as the Americans say) as the decimal separator and the date shown as mm/dd/yyyy:

Decimal example 100.001 – Date example 12/1/2015

While the expression

Text.Format(
    "Decimal example #{0} – Date example #{1}",
    {100.001, #date(2015, 12, 1)},
    "fr-fr")

…returns the same values formatted for French/France, where the decimal separator is a comma and dates are formatted dd/mm/yyyy:

Decimal example 100,001 – Date example 01/12/2015

How about some more advanced examples? Here’s a table in Excel:

image

If you load it into Power Query and then create a custom column, in your custom column expression you can refer to the current row as a record using the _ (underscore) character. So creating a custom column using the following expression:

Text.Format("the #[Animal] sat on the #[Furniture]", _)

image

…returns a table that looks like this:

image
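Put together as a full query, the steps might look like this (a sketch – the Excel table name AnimalFurniture and its Animal and Furniture columns are assumptions standing in for the table in the screenshot):

```m
let
    // Read the named table from the current workbook
    Source = Excel.CurrentWorkbook(){[Name="AnimalFurniture"]}[Content],
    // _ refers to the current row as a record, so Text.Format
    // can pull the Animal and Furniture fields from it
    AddSentence = Table.AddColumn(Source, "Sentence",
     each Text.Format("the #[Animal] sat on the #[Furniture]", _),
     type text)
in
    AddSentence
```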

You could also use Text.Format() to create parameterised queries to run against a database. Here’s an example of an MDX query on the Adventure Works DW database with a parameterised WHERE clause:

let
    MDXQueryTemplate = 
	"SELECT#(lf)
	{[Measures].[Internet Sales Amount]} ON 0,#(lf)
	NON EMPTY#(lf)
	[Date].[Calendar Year].[Calendar Year].MEMBERS ON 1#(lf)
	FROM#(lf)
	[Adventure Works]#(lf)
	WHERE(
	STRTOMEMBER(
	""[Customer].[Country].&[#[Country]]"", 
	CONSTRAINED))",
    ReplaceCountryParameter = 
	Text.Format(
		MDXQueryTemplate, 
		[Country="Australia"]),
    RunQuery = 
	AnalysisServices.Database(
		"localhost", 
		"adventure works dw 2008", 
		[Query=ReplaceCountryParameter])
in
    RunQuery

Remember, if you do something like this you’ll probably want to disable native database prompts – if you don’t, you’ll be asked to approve every new query that gets run. Also, you’ll notice that I’m using the StrToMember() function with the Constrained flag in the WHERE clause because, even though it’s not really necessary, it’s good from a security point of view. It would be really good if we could use proper MDX parameters in our queries but I don’t think it’s possible, unless there’s some other new feature or function that I don’t know about.

New SSAS, Power BI And SQL Server Training Courses For 2016

I’ve just added a number of new 2016 training course dates to the Technitrain site. If you’re looking for Power BI, Analysis Services, SQL Server or Data Science training in London then please check them out! Also, if you’d like to sign up for the Technitrain newsletter to stay up-to-date with our news you can do so here.

Here are more details on the new courses:

SQL Server Performance Tuning and Internals Boot Camp, Bradley Ball, 15-19 February 2016, London
This 5-day hands-on course is designed to provide DBAs with the tools and knowledge that are required to keep their SQL Servers running efficiently and reliably.
http://technitrain.com/coursedetail.php?c=60&trackingcode=CWB

Introduction To Power BI, Chris Webb, 22-23 February 2016, London
A two-day introduction to Microsoft’s exciting new BI tool suitable for BI professionals, analysts, report developers or anyone interested in using it to build reports or dashboards.
http://technitrain.com/coursedetail.php?c=65&trackingcode=CWB
 
Real World Cube Design And Performance Tuning With SSAS Multidimensional,  Chris Webb, 11-13 April 2016, London
A course aimed at intermediate-to-experienced Analysis Services Multidimensional developers, looking at more advanced cube design topics and query performance tuning.
http://technitrain.com/coursedetail.php?c=66&trackingcode=CWB

Mastering DAX, Marco Russo, 20-22 June 2016, London
A three-day introduction to the DAX language used by Power BI, Analysis Services Tabular models and Power Pivot.
http://technitrain.com/coursedetail.php?c=63&trackingcode=CWB

Optimising DAX, Marco Russo, 23-24 June 2016, London
An advanced two-day course for experienced DAX developers who wish to learn how to optimise DAX calculations and queries for performance.
http://technitrain.com/coursedetail.php?c=64&trackingcode=CWB

Introduction to MDX, Chris Webb, 4-6 July 2016, London
A three day course designed for those with little or no experience of MDX, this course will teach you how to write MDX queries and calculations for Analysis Services.
http://technitrain.com/coursedetail.php?c=67&trackingcode=CWB
 
Practical Data Science with Cortana Analytics, Rafal Lukawiecki, 24-27 October 2016, London
This course is aimed at analysts, analytical power users, predictive developers, BI power users and developers, budding data scientists and consultants.
http://technitrain.com/coursedetail.php?c=68&trackingcode=CWB