When DirectQuery mode for Power BI was announced I assumed it was just the next version of SSAS Tabular DirectQuery mode with the same extra features that we’ll get in SSAS 2016 (such as better SQL generation, and other benefits enabled by Super-DAX). If it was just that I would have been happy, but there’s something else that was mentioned in Miguel’s video introducing the feature that I almost missed, something that is also hinted at in the documentation where it mentions the following limitation:
If the Query Editor query is overly complex an error will occur. To remedy the error you must: delete the problematic step in Query Editor, or Import the data instead of using DirectQuery
It turns out that Power BI in DirectQuery mode is actually SSAS DirectQuery version 2.0 combined with Power Query/Power BI “Get Data”’s query folding capabilities (where the logic in your queries is pushed back to the data source rather than evaluated inside Power BI) – which is quite interesting.
Let’s look at an example using the Adventure Works DW database and SQL Server. If you import just the DimDate table in DirectQuery mode and create a table that shows the count of values from the DateKey column grouped by CalendarYear, like this:
The following SQL will be generated:
SELECT TOP (1000001) [t0].[CalendarYear] AS [c15], COUNT_BIG([t0].[DateKey]) AS [a0] FROM ( (select [$Table].[DateKey] as [DateKey], [$Table].[FullDateAlternateKey] as [FullDateAlternateKey], [$Table].[DayNumberOfWeek] as [DayNumberOfWeek], [$Table].[EnglishDayNameOfWeek] as [EnglishDayNameOfWeek], [$Table].[SpanishDayNameOfWeek] as [SpanishDayNameOfWeek], [$Table].[FrenchDayNameOfWeek] as [FrenchDayNameOfWeek], [$Table].[DayNumberOfMonth] as [DayNumberOfMonth], [$Table].[DayNumberOfYear] as [DayNumberOfYear], [$Table].[WeekNumberOfYear] as [WeekNumberOfYear], [$Table].[EnglishMonthName] as [EnglishMonthName], [$Table].[SpanishMonthName] as [SpanishMonthName], [$Table].[FrenchMonthName] as [FrenchMonthName], [$Table].[MonthNumberOfYear] as [MonthNumberOfYear], [$Table].[CalendarQuarter] as [CalendarQuarter], [$Table].[CalendarYear] as [CalendarYear], [$Table].[CalendarSemester] as [CalendarSemester], [$Table].[FiscalQuarter] as [FiscalQuarter], [$Table].[FiscalYear] as [FiscalYear], [$Table].[FiscalSemester] as [FiscalSemester] from [dbo].[DimDate] as [$Table]) ) AS [t0] GROUP BY [t0].[CalendarYear]
Then, if you go to Edit Queries and set a filter on EnglishDayNameOfWeek so that you only get the dates that are Fridays, like so:
The click Close And Apply, you’ll see that the table now shows the count of dates in each year that are Fridays (as you would expect):
…and the SQL generated also reflects that filter:
SELECT TOP (1000001) [t0].[CalendarYear] AS [c15], COUNT_BIG([t0].[DateKey]) AS [a0] FROM ( (select [_].[DateKey], [_].[FullDateAlternateKey], [_].[DayNumberOfWeek], [_].[EnglishDayNameOfWeek], [_].[SpanishDayNameOfWeek], [_].[FrenchDayNameOfWeek], [_].[DayNumberOfMonth], [_].[DayNumberOfYear], [_].[WeekNumberOfYear], [_].[EnglishMonthName], [_].[SpanishMonthName], [_].[FrenchMonthName], [_].[MonthNumberOfYear], [_].[CalendarQuarter], [_].[CalendarYear], [_].[CalendarSemester], [_].[FiscalQuarter], [_].[FiscalYear], [_].[FiscalSemester] from [dbo].[DimDate] as [_] where [_].[EnglishDayNameOfWeek] = 'Friday') ) AS [t0] GROUP BY [t0].[CalendarYear]
What’s happening here is that the output of “Get Data” (we so need a better name for this feature – how about “the functionality formerly known as Power Query”?) becomes the inner SELECT statement with the filter on EnglishDayNameOfWeek; while the table in the report that returns the count of dates by Year is responsible for generating the outer SELECT statement with the GROUP BY (this is the part of the Power BI engine that is related to SSAS DirectQuery).
Now, you can only do this if all the steps in “Get Data” can be folded back to SQL. How do you know if they can or not? Well, if query folding can’t take place then you’ll get an error: this is what is meant by the warning about your query being “overly complex” in the documentation. Unfortunately there’s no way of knowing in advance what can be folded and what can’t; with every release of Power BI Desktop and Power Query I’ve noticed that more and more things can be folded (and I’m always pleasantly surprised at the transformations that can be folded, such as pivots and basic calculated columns), but there are still plenty of limitations. For example at the time of writing adding an index column to your query will prevent query folding and therefore break DirectQuery. If you do this, you’ll see the following error in the Query Editor window:
This step results in a query that is not supported in DirectQuery mode
Even with this restriction I think the ability to apply transformations in Get Data is very useful indeed, because it means you have a lot of scope for cleaning and filtering data in DirectQuery mode and therefore building ‘live’ reporting solutions on data that isn’t modelled the way you’d like it to be.
While I’m talking about DirectQuery mode, there are a few other points I’d like to mention:
- Remember, it’s still in Preview and so it has some limitations and bugs. For example, I’ve hit an issue where DirectQuery fails with a connection from my “Recent Sources” list, but works ok if I create a new connection.
- Prepare to be underwhelmed by the performance of DirectQuery, people: remember this is just ROLAP by another name, ROLAP has been around for years, and ROLAP has always had performance problems. These problems are not just related to the speed of the underlying engine or the size of the data – the ability of the OLAP engine to generate the SQL to get the data it needs also plays a major role. SSAS Multidimensional ROLAP and SSAS Tabular 2012-4 DirectQuery generate some pretty wretched SQL even in the most simple scenarios and it looks like Power BI DirectQuery is a big improvement on them. But what about more complex queries? This is a very difficult problem to crack. My feeling is that if your data does fit into Power BI’s native engine then you should import it rather than use DirectQuery, if you want to get the best possible query performance.
- I suspect that this performance issue is also the reason why the New Measure button is greyed out in Power BI Desktop when you’re in DirectQuery mode. This isn’t a limitation of the engine because SSAS Tabular does support more complex DAX measures in DirectQuery mode, albeit with some restrictions on the functions you can use. However, the more complex your DAX measures are, the more complex the problem of generating SQL becomes and the more likely your queries are to be slow. I don’t think this is a good reason for completely preventing users from creating their own measures though: there are lots of scenarios where you will need to create measures and performance would still be acceptable. Maybe this is an example of an ‘advanced’ feature that could be switched on by power users?