Value.ResourceExpression() And Query Folding In Power BI

When the June update for Power BI Desktop was released, I was extremely excited to discover an undocumented new feature that allows you to see when query folding is taking place while loading data in the Query Editor. Devin Knight beat me to blogging about it with an excellent post that I strongly recommend you to read:

https://devinknightsql.com/2016/07/03/power-bi-checking-query-folding-with-view-native-query/

As Devin says, this is very important because it’s the first time that we as developers can check to see whether query folding is taking place within the Power BI Query Editor (I guess this feature will come to Power Query soon, maybe sometime in autumn 2016), and getting query folding working is the key to getting good performance when loading data.

Looking at this more closely, I’ve found that it’s also possible to get the query generated in code as well as just seeing it in a dialog box. For example, consider the following M code that returns a few columns and some filtered rows from the DimDate table in the Adventure Works DW SQL Server database:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = 
	Table.SelectColumns(
		dbo_DimDate,
		{"DateKey", "FullDateAlternateKey", 
		"DayNumberOfWeek"}),
    #"Filtered Rows" = 
	Table.SelectRows(
		#"Removed Other Columns", 
		each ([DayNumberOfWeek] = 1))
in
    #"Filtered Rows"

image

You can confirm that query folding is taking place and see the SQL that is being generated for the Filtered Rows step by right-clicking on it in the Applied Steps pane and choosing View Native Query:

image

image

If you now add two lines of code to the query you can get the SQL query for the #”Filtered Rows” step as a text value using the Value.ResourceExpression() function, like so:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = 
	Table.SelectColumns(
		dbo_DimDate,
		{"DateKey", "FullDateAlternateKey", 
		"DayNumberOfWeek"}),
    #"Filtered Rows" = 
	Table.SelectRows(
		#"Removed Other Columns", 
		each ([DayNumberOfWeek] = 1)),
    FindQueryRecord = 
        Value.ResourceExpression(#"Filtered Rows"),
    Output = 
        FindQueryRecord[Arguments]{1}[Value]
in
    Output

image

As far as I can see, the Value.ResourceExpression() function can be used with any expression where query folding is taking place and returns a record which, if you dig deep enough into it contains the SQL query. The record itself seems to contain all the information needed to invoke the new Value.NativeQuery() function which I’ve also been looking at recently – but I won’t be blogging about just yet because I’m told it’s not quite finished.

Why is it useful to get the SQL query as text, rather than just seeing it? Well… I have some ideas, and hopefully some other upcoming blog posts will make things clearer!

Happy First Birthday Power BI!

To mark the first anniversary of Power BI reaching RTM, Paul Turley and Adam Saxton have organised a celebration in the form of coordinated blog posts from the community and a video to say thank you to the Power BI team at Microsoft:

This has been a great year for Power BI and its success is a direct result of all of the hard work that the team have put in. Speaking personally, I am incredibly grateful for all the help and advice that I get on a daily basis from individual Microsoft employees who are often providing it in their own time. My congratulations to James Phillips and to everyone who has worked on Power BI to make it what it is today!

Deprecated And Discontinued Functionality In SSAS 2016

Some time ago I blogged about the deprecated and discontinued functionality in SSAS 2014, so I thought it would be a good idea to follow my last post on what’s new in SSAS 2016 Multidimensional with a discussion of what’s going or gone from it.

The same page that I linked to last time has been updated for 2016, and there are four more subpages with all the details. There’s nothing much interesting to say about the breaking changes (basically AMO has been rejigged) or behaviour changes (there’s no in-place upgrade for Tabular models using DirectQuery – you have to open the project and edit some settings) but the other two pages do have some news worthy of comment:

Discontinued Functionality

Here’s the official definition of a discontinued feature:

A discontinued feature is one that is no longer supported. It might also be physically removed from the product.

A few comments about what is now discontinued:

  1. The Non_Empty_Behavior property for calculated measures. To be honest, I’m happy to see this go: it doesn’t usually make much difference to performance and in most cases people use it incorrectly too. It really should be removed from Form View in the SSDT cube editor.
  2. COM assemblies. Note that this is not the same thing as .NET assemblies like the Analysis Services Stored Procedure Project! If you are using a custom MDX function implemented in a COM assembly you will probably find that the equivalent function implemented in a .NET assembly is a lot slower, but in my experience it’s almost always possible to avoid custom functions completely and use pure MDX – and this will give you the best query performance.
  3. I’m a bit sad to see the CalculationCurrentPass() and CalculationPassValue() functions die because you could do cool things like this with them, but I haven’t actually needed to use them for a long, long time.

Deprecated Functionality

Here’s the official definition of a deprecated feature:

A deprecated feature is a feature will be cut from the product in a future release, but is still supported and included in the current release to maintain backward compatibility. Typically, a deprecated feature is removed in a major release, often within two releases of the original announcement. For example, deprecated features announced in SQL Server 2012 are likely to be unsupported by SQL Server 2016.

A few comments on what’s been deprecated:

  1. The only thing that someone may possibly be using on the list of features that will not be supported in the next major release is linked dimensions; remote partitions and remote linked dimensions were always a very bad idea.
  2. The death of dimension writeback (note: not the same thing as writing values back to cells) is a bit of a shame: I never needed to use it, and most client tools didn’t support it, but it always struck me as one of those features that people might have used more if they had known about it.
  3. I see session cubes (as used by Excel PivotTables’ grouping functionality) will not be supported in a future release, which is probably a good thing given all the problems they cause. However I bet there are a lot of Excel workbooks out there that will be affected when this does go.
  4. Local cubes will also go in a future release, though I doubt anyone uses them anymore. That said, I think it would be really useful if there was something similar that allows you to create a Power Pivot model that was a local copy of an SSAS Tabular database, with the option to filter the data in it. Offline access is not so important these days but this would allow users to create their own customised Power Pivot models from a properly designed, central model rather than always having to start from scratch.
  5. Profiler for trace capture is also deprecated, and I discussed this in-depth here.

 

What’s New In Analysis Services 2016 Multidimensional?

With the recent release of SQL Server 2016 some of you old faithful SSAS Multidimensional users out there might be wondering if it’s worth upgrading. There is an official page describing what’s new in Analysis Services 2016 here:

https://msdn.microsoft.com/en-us/library/bb522628.aspx

…but it’s mostly concerned with SSAS Tabular and, unfortunately, it’s somewhat misleading. For example, it says:

a number of enhancements have been made to multidimensional models; for example, distinct count ROLAP optimization for data sources like DB2 and Oracle, drill-through multi-selection support with Excel 2016, and Excel query optimizations.

In fact the drillthrough/multi-select improvements (which I blogged about here) already shipped as part of SSAS 2014 and are reliant on improvements in Excel 2016 as much as in SSAS; similarly the Excel 2016 query optimisations are not reliant on any changes in SSAS 2016 and will benefit users of all versions of SSAS.

So what has actually changed with SSAS 2016 Multidimensional? I don’t know all the details on every change, but here’s what I know right now:

  • As the above quote shows, there have been improvements in ROLAP distinct count performance for DB2 and Oracle. I’ve also heard there are improvements for the SQL generated in ROLAP mode to take advantage of SQL Server’s columnstore indexes.
  • We now have Database Consistency Checker for SSAS, which you can read more about here or in Dustin Ryan’s post here. For Multidimensional it will only tell you whether your partition indexes are corrupt (it does a lot more for Tabular) and if they are you need to delete the database and then either restore it or re-deploy/reprocess.
  • Extended Events for SSAS now have UI support in SQL Server Management Studio, although the UI still looks like it needs some work. I still see myself using Profiler for my SSAS performance tuning work for the foreseeable future – or at least until I work out what the extra information that Extended Events give you is useful for. I know other people, like Bill Anton, are more excited about Extended Events and their possibilities though.
  • You can now use computer accounts to be members of the Analysis Services Administrators group in SQL Server Management Studio.
  • If you are developing applications that need to create, alter, process or otherwise do stuff to an Analysis Services database in .NET code then you need to be aware that Analysis Management Objects (AMO) has been refactored to include a second assembly which “paves the way for future extensions to AMO, with clear division between generic and context-specific APIs”.
  • There’s a new default setting for the MemoryHeapType server property that helps to avoid memory fragmentation and is relevant to Tabular and Multidimensional; Marco Russo has the details here.

Not the most exciting or inspiring set of changes, I have to say, even if there’s lots of cool new stuff in SSAS Tabular 2016. There are also usually other fixes and improvements like the MemoryHeapType property that get added but are never documented, and when I find out about them I’ll be sure to blog or tweet. If you find any please let me know!

Excel 2016 PivotTable MDX Changes Lead To Big Query Performance Gains

Yesterday, the Excel team announced a bunch of performance improvements in Excel 2016 for PivotTables connected to Analysis Services (Multidimensional or Tabular), Power Pivot and Power BI. If you haven’t read the official blog post already, here it is:

https://blogs.office.com/2016/07/07/faster-olap-pivottables-in-excel-2016/

In that post, Microsoft point out that how much of a performance increase you get will depend on a number of factors. I guess they have to do this to manage expectations. However I’m going to come right out and say that these changes are probably the most important thing that has happened in the last five years for Analysis Services or Power Pivot query performance and if you are using Excel PivotTables with SSAS or Power Pivot for reporting, you should upgrade to Excel 2016 (you’ll need the Office 365 click-to-run version) just for these changes.

The key improvement that Microsoft have made here is to stop Excel bringing back unnecessary subtotal values in the MDX queries that it generates. This has been a gigantic problem for years and several people have blogged about it: I did here, and among other people Rui Quintino and Richard Lees have posts on the subject that are worth reading. A lot of my consultancy work involves SSAS and MDX performance tuning and I think I see around five to ten customers every year who are seriously affected by this – and I’m sure this is just the tip of the iceberg. Even if you don’t think you are affected, you may have users who are putting up with slow refresh times for Excel-based reports without your knowledge.

Let’s go into some technical detail, see what the original problem was and what has changed in Excel 2016. I’m going to provide some examples using Excel 2013/2016 against SSAS Multidimensional but as I said the same problems occur (although less severely) with SSAS Tabular and Power Pivot.

Consider the following Excel 2013 PivotTable connected to the Adventure Works cube:

image

image

It’s typical of the kind of thing an end user wants to build, and there are two things to point out:

  1. Notice there are three fields that have been placed on the Rows axis of the PivotTable: the Gender and Marital Status hierarchies from the Customer dimension, and the Category hierarchy from the Product dimension. It’s pretty common for users to put many more fields together on an axis when they want to display detailed data.
  2. The user has also turned off grand totals and subtotals in the PivotTable so that only the detailed data is shown. This often happens in combination with (1) because if the user only wants to see the detailed data the subtotals and grand totals are a distraction.

In this case the PivotTable layout has been switched Tabular and the Repeat All Item Labels option is selected to make the PivotTable look like a table, but these are just aesthetic changes.

This PivotTable contains 12 rows, but if you look at the MDX query that is generated by Excel 2013 to populate it, it returns 36 rows of data (actually the MDX in this case returns 36 columns, but that’s a quirk of Excel) because the subtotals and grand total that Excel is not displaying are still being returned. Here’s the query:

SELECT 
NON EMPTY 
CrossJoin(
CrossJoin(
Hierarchize(
{DrilldownLevel(
{[Customer].[Gender].[All Customers]}
,,,INCLUDE_CALC_MEMBERS)}), 
Hierarchize(
{DrilldownLevel({[Customer].[Marital Status].[All Customers]}
,,,INCLUDE_CALC_MEMBERS)})), 
Hierarchize(
{DrilldownLevel({[Product].[Category].[All Products]}
,,,INCLUDE_CALC_MEMBERS)})) 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME 
ON COLUMNS  
FROM 
[Adventure Works] 
WHERE ([Measures].[Internet Sales Amount]) 
CELL PROPERTIES VALUE, FORMAT_STRING, 
LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

And here’s a sample of what it returns, with the unwanted values highlighted:

image

You can probably guess that these unwanted subtotals and grand totals make the query slower, and the more fields you put together on the rows or columns of a PivotTable the greater the number of subtotals/grand totals are returned and slower things get. PivotTables connected to SSAS or Power Pivot often become completely unusable with more than six or seven fields put together on the same axis, and the only workarounds before Excel 2016 are either to write custom MDX in a named set (I show how to do this here) or if you are using SSAS Multidimensional try to use member properties instead of separate fields (as Richard Lees shows here), but neither are really great alternatives.

Building the same PivotTable using Excel 2016, however, gives the following MDX:

SELECT 
NON EMPTY 
CrossJoin(
CrossJoin(
Hierarchize(
{[Customer].[Gender].[Gender].AllMembers}), 
Hierarchize(
{[Customer].[Marital Status].[Marital Status].AllMembers})), 
Hierarchize(
{[Product].[Category].[Category].AllMembers})) 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME 
ON COLUMNS  FROM [Adventure Works] 
WHERE ([Measures].[Internet Sales Amount]) 
CELL PROPERTIES VALUE, FORMAT_STRING, 
LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

Which returns just the data needed for the PivotTable:

image

What has changed? Focusing on just the Gender hierarchy, the difference between the two queries is that in Excel 2013 the selection on Gender is given using the DrillDownLevel() function on the All Member of the hierarchy. Isolated and put in its own query, the set expression used looks like this:

SELECT 
{[Measures].[Internet Sales Amount]}
ON COLUMNS,
DrilldownLevel(
{[Customer].[Gender].[All Customers]}
,,,INCLUDE_CALC_MEMBERS)
ON ROWS
FROM
[Adventure Works]

image

Notice how the All Member, used by Excel to return subtotals and grand totals, is returned by the set. Excel 2016 instead just asks for the members on the Gender level of the Gender hierarchy, not including the All Member. Again, put inside its own query, you can see what it returns:

SELECT 
{[Measures].[Internet Sales Amount]}
ON COLUMNS,
{[Customer].[Gender].[Gender].AllMembers}
ON ROWS
FROM
[Adventure Works]

image

Apart from the fact that no All Member is returned, it’s also a lot cleaner and easier to read.

I’ve been very excited about this change since I first heard it was happening, and I think it will benefit a lot of people. A big thanks to Alexander Lahuerta and the rest of the Excel team for making this happen!

What Monthly Updates For SQL Server Management Studio And SSDT Mean For SSAS Developers

Last month I upgraded the laptop I use for teaching to SQL Server 2016 and, as is always the way, not long afterwards I ran into a problem while I was doing a demo during a course: the SSAS Deployment Wizard didn’t work for SSAS 2016 Tabular projects created at the new 1200 compatibility level. I made some enquiries and was told that the tooling team hadn’t had time to do the necessary dev work before RTM but had finished it soon after, and that the new update would be in the July release of SQL Server Management Studio.

This reminded me of something I had heard about but forgotten, thinking it wasn’t all that relevant to me: SQL Server Management Studio is now no longer bundled with SQL Server but is instead a free, separate download that is updated monthly. This official blog post from May has all the important details; Aaron Bertrand’s blog post from April is a great summary of what this means for SQL Server in general (his interview with Kent van Hyning, the Engineering Manager for SQL Server Client Tools is also worth reading); and the SQL Server Release Services blog is the place to look for new update announcements. To get back to my story, when the July update for SSMS was released a few days ago I hit the Check For Updates menu option in SSMS, it updated, and I got a working version of the Deployment Wizard (the Changelog has a complete list of all of the fixes/improvements in this version). The moral of this story is that even though all of the publicity around the new direction for SQL Server Management Studio is directed at the SQL Server relational engine community, it’s equally important for us BI developers – and we are going to have to make the effort to update SSMS regularly from now on.

Of course a similar change has happened with SQL Server Data Tools, which is also now getting monthly updates. I’ve heard a few complaints around bugs and instability in SSDT recently and it looks like these issues are being fixed fairly rapidly: check out the list of SSAS-related fixes in the July update for SSDT in the post on the SSDT team blog and the associated Changelog. Again, as BI developers we’re going to have to get into the habit of updating SSDT on a regular basis.

While getting bugs fixed quickly is great, I really hope that this new focus on tooling means that we get new BI-related features in SSMS and SSDT. SSMS has, in particular, been shockingly bad at supporting BI developers: for example, I find it unbelievable that we still don’t have a DAX query window in SSMS when the language has been in SSAS Tabular since SQL Server 2012 (I know the community have filled the gap with DAX Studio, but that’s not the point). Aaron Nelson recently announced a Trello board where ideas for new features in SSMS can be debated but there’s very little BI-related stuff there at the time of writing.

Finally, a recent Reddit AMA with the SQL Server Tooling Team gave away a few interesting titbits about what might be happening in the future for BI:

  • In response to a question about version control for SSIS/SSAS/SSRS:
    In regards to AS and RS version control. We are considering improvements to AS model where we actually break up the single model file into independent objects (tables, measure, roles, etc…).
  • Regarding MSBuild support for SSAS and SSRS:
    At least for SSAS and SSRS, we have MSBuild support on the backlog, but it isn’t the highest priority at the moment.
    The SSIS team has plans for MSBuild support and are considering to support this in a future release of SSDT. Stay tuned!
  • Regarding tooling support for the new SSAS 2016 1200 compatibility level:
    The plan is to gradually improve the Tabular experiences in SSDT and SSMS, particularly for the new compat level 1200. We now have a better scripting language (TMSL) and a way better object model (TOM) and this sets us up for introducing improvements with the upcoming monthly releases. If everything goes to plan, you will see a significant new capability in the SSDT Tabular August release, and it will give you an idea where things are going. Sorry for not giving more details at this time. Let’s count the chickens when they hatch.
  • On the question of why there’s no support for DAX in SSMS:
    These things simply didn’t fit into the SQL Server 2016 release timeframe. You’ll see improvements in future versions of the tools.

Step Descriptions And Comments In M Code In Power BI Desktop

The June release of Power BI Desktop has what seems to be a fairly unremarkable new feature in that it allows you to add descriptions to each step in a query in the Query Editor window. However the implementation turns out to be a lot more interesting than you might expect: the step descriptions become comments in the M code, and even better if you write M code in the Advanced Editor window your comments appear as descriptions in the Applied Steps pane.

Take the following M query, entered in the Advanced Editor, as an example:

let
    Step1 = 5,
    Step2 = 10,
    Step3 = Step1 * Step2
in
    Step3

There are three variables declared in the let expression which appear as three steps in the Applied Steps pane. The first two steps declare integers and the third multiplies these two integers together, returning 50.

If you right-click on the first step and select Properties, then you can enter a description for the step in the Properties pane that appears:

image

image

After you click OK, the description is visible as a tooltip when you mouse-over the step:

image

If you then open the Advanced Editor window you’ll see the M code for the query has now been changed to include a comment (NB comments in M code start with //):

let
    // Declare the first number
    Step1 = 5,
    Step2 = 10,
    Step3 = Step1 * Step2
in
    Step3

image

If you then edit the M code in the Advanced Editor window to add a comment in the line before a step, like so:

let
    // Declare the first number
    Step1 = 5,
    Step2 = 10,
    // Multiple the two numbers together
    Step3 = Step1 * Step2
in
    Step3

…then this will also show up as a description when you mouse-over the step in the Applied Steps pane:

image

As a result, for anyone like me who writes a lot of M code manually in the Advanced Editor window, this turns out to be a really handy feature.