Last Friday’s big news was the release of the first CTP for Analysis Services v.next. Among several major new pieces of functionality (Ragged hierarchies! Drillthrough that works properly, even for calculations! Table-level security!) probably the biggest is the integration of Power Query/M into Analysis Services. As you can probably guess, I’m incredibly pleased that my two favourite technologies have got together. The technical details are given in this blog post, which I suggest you read if you haven’t done so already, but what I think is missing is an explanation of why this is so important and what kind of opportunities it opens up – hence this post. Of course this is just my take on the subject and not what Microsoft may actually thinking; it’s also very early days, so as the functionality develops and I have more chance to think about this my opinions may change. If you have any ideas on this subject I would be interested to hear them so please leave a comment!
Why this had to happen: Power BI
There is an obvious reason why Microsoft decided to integrate Power Query/M into SSAS, and that is because it needs to support the conversion of Power BI models into Analysis Services Tabular models. There are two scenarios where this will be necessary.
The first is the ability to convert a Power BI model into an Azure Analysis Services Tabular model (listed as ‘planned’ here), something that will be a key selling point for Azure Analysis Services when it releases. The engine behind Power BI is essentially the same as the one used in Analysis Services so migrating the data model should be straightforward, but since Power BI uses Power Query/M to load data then a migrated Azure Analysis Services model will also have to use Power Query/M.
The second scenario is similar to the first. We now know that on-premises Power BI will be delivered through Reporting Services v.next, and it’s reasonable to assume Reporting Services will need a database engine to store the data for published Power BI reports. That engine will have to be an Analysis Services instance of some kind (either standalone or running in-process inside Reporting Services) and again for that to work Analysis Services will have to support the same data access mechanisms as Power BI.
Better support for a larger number of data sources
I’ve just argued why Microsoft was obliged to include this functionality in SSAS v.next but in fact there are many positive reasons for doing this too. The most obvious one is to do with support for more data sources. At the moment SSAS Tabular supports a pretty good range of data sources, but the world of BI is getting more and more diverse and in order to stay relevant SSAS needs to support far more than it does today. By using Power Query/M as its data access mechanism, SSAS v.next will immediately support a much larger number of data sources and this number is going to keep on growing: any investment that Microsoft or third parties make for Power BI in this area will also benefit SSAS. Also, because Power Query/M can query and fold to more than just relational databases, I suspect that in the future this will allow for DirectQuery connections to many of these non-relational data sources too.
Different data sources for partitions in the same table
Another benefit of this change is that we’ll have a lot more flexibility with partitioning tables in an SSAS Tabular model. As the blog post says:
As long as a partition’s M query adheres to the column mappings of the table, you are free to perform any transformations and pull in data from any data source defined in the model.
In SSAS 2016 the partitions in a table all have to get data from the same data source whereas in v.next we’ll be able to get data from different data sources in different partitions, and this opens up some interesting new possibilities. For example, I can imagine a simple budgeting application where the partitions in a table get data from different Excel workbooks stored in OneDrive for Business, and where the each partition gets processed automatically when changes are saved to one of these workbooks.
Does this replace SSIS and my data warehouse?
The short answer is no. Power Query/M is not a full-featured ETL tool and I don’t think it ever will be; it certainly does not have the kind of functionality needed to perform enterprise-level ETL right now. My view is that Microsoft have built Power Query/M into SSAS for the reasons above and not to encourage enterprise SSAS users to do their own quick-and-dirty ETL when loading data (although there is a risk that that will happen anyway). That said, I think the dividing line between corporate and self-service BI will become increasingly blurred over the next few years as the Microsoft BI stack develops, and we’ll see Analysis Services being used in self-service scenarios as well as the more traditional corporate ones.
Centralised data source objects
One last thing to point out is that the way SSAS v.next makes a distinction between data sources and other queries is very interesting. In Power BI and Power Query it’s easy to end up with data source connection information duplicated across multiple queries unless you know what you’re doing, and this can cause no end of problems later on in a project. As far as I can see, in SSAS v.next a “data source object” is an M query that only contains the connection to external data, while all other queries have to reference a data source to be able to access external data. This means, as the blog post says:
Referring to data source objects helps to centralize data source settings for multiple queries and simplifies deployments and maintenance if data source definitions must be updated later on. When updating a data source definition, all M queries that refer to it automatically use the new settings.
I wonder whether this concept is coming to Power BI and Power Query at some point? I hope so – it makes a lot of sense.