Power Query Comes To Azure Data Factory With Wrangling Data Flows

One of the many big announcements at Build this week, and one that caused a lot of discussion on Twitter, was about Wrangling Data Flows in Azure Data Factory. You can read the blog post here:

https://azure.microsoft.com/en-us/blog/analytics-in-azure-remains-unmatched-with-new-innovations/

…but what isn’t clear from this is that it’s basically Power Query Online integrated into ADF. You can see it in action by watching the following video – the demo of Wrangling Data Flows starts at around the 21 minute mark:

https://mybuild.techcommunity.microsoft.com/sessions/76997

image

As the presenter says, the Power Query Online editor generates M in the background as you would expect and “we are going to take this M and translate it into Spark and run it over big data”. Query folding to Spark, basically. More technical detail about all this is available here:

https://github.com/gauravmalhot/wranglingdataflow

…including a document discussing which M functions currently support query folding and which ones as yet don’t. Obviously, this feature will only work well if as much query folding as possible takes place.

This feels like a much more significant win for team Power Query than the integration with SSIS that was announced recently, if only because SSIS is a bit legacy and ADF is the cool new thing. I wonder if this opens up the possibility of integration between Power BI dataflows and ADF in the future, as another example of how self-service BI solutions can be easily transitioned into centrally-managed, enterprise-grade BI solutions? If that happens I hope someone sorts out the dataflow/data flow naming mess.

You can sign up for the preview of Wrangling Data Flows here.

%d bloggers like this: