In the Power BI/Fabric community everyone is excited about the recent release of Semantic Link: the ability to analyse Power BI data easily using Python in Fabric notebooks. Sandeep Pawar has an excellent blog post here explaining what this is and why it’s so cool. Meanwhile in the Excel community, everyone is excited about the new integration of Python into Excel. But can you analyse Power BI data in Excel using Python? Yes you can – so as my teenage daughter would say, it’s time for a crossover episode.
Let’s see a simple example. The main problem to solve is how to get data from Power BI into Excel in a format that Python in Excel can consume easily, ie a table rather than a PivotTable. The easiest way to do this is using the new Excel Connected Table feature, described here, which creates a live connection back to Power BI so when the data in the dataset changes the data in Excel is updated too. I have a Power BI dataset published to the Service that contains data from the UK government’s Land Registry Price Paid data which has details of all the property transactions in England and Wales; I found this in Excel and clicked Insert Table:
I then defined a query that found the number of property transactions and average price paid broken down by county:
This gave me a table, connected to the dataset in the Power BI Service using a DAX query, in my worksheet:
I renamed this table to “Sales”:
Unfortunately you can’t change the ugly column names without rewriting the DAX query behind the table, which makes life more difficult later on.
Then, on a new worksheet, I created a Python code cell using the PY function and entered the following Python code:
df=xl("Sales[#All]", headers=True) s = plt.scatter(df.iloc[:,1], df.iloc[:,2], marker=11) s.axes.xaxis.set_label_text("Count Of Sales") s.axes.yaxis.set_label_text("Average Price Paid")
[I’ve only just started learning Python so please excuse any newbie bad practices/mistakes here! The main point is to visualise the data returned from Power BI]
This gave me a scatterplot with each county as a marker, the count of sales measure on the x axis and the average price paid measure on the y axis:
A few comments:
- The xl function allows you to reference Excel cells, ranges and tables in your code; the reference to Sales[#All] gets the whole of the Sales table, including headers; adding headers=True means the table headers are recognised as such
- Dealing with those ugly column names in Python is such a pain that I copped out and referenced the columns by position
- After entering the code and committing it, you also need to tell Excel to treat the output as an Excel Value rather than a Python object to see the scatterplot; you also need to resize the cell
A second way of getting data into Excel from Power BI is to export the data from a published Power BI report. If you’re going to do that, you should export as a connected table so again the data stays connected to the source Power BI dataset.
There’s also a third , slightly different way of getting data from Power BI into Excel that is possible if you have Premium and which is a bit more complex but also more flexible: you can use Power Query, but maybe not in the way you would expect. The xl function can reference the output of a Power Query query even if that query is not loaded to a worksheet or the Excel Data Model – which I think is a nice touch and important if you’re working with larger data volumes.
To get data from Power BI into Excel using Power Query you need to use Power Query’s Analysis Services connector to connect to your workspace’s XMLA Endpoint. Go to the Data tab in Excel, click the Get Data button then From Database/From SQL Server Analysis Services Database (Import):
On the connection dialog the XMLA Endpoint goes into the Server box, the name of the dataset goes into the Database box and you can paste a DAX query into the MDX or DAX query box:
There are several benefits to using this approach:
- You can use your own DAX query rather than have one generated for you
- You can easily edit the DAX query after you have created the Power Query query
- You can rename the query as well as all those ugly column names, making them easier to work with in Python – I named my query SalesByCounty and renamed my columns to County, CountOfSales and AveragePricePaid
I then closed the Power Query Editor without loading the output of the query anywhere.
You can read more about how to use Power Query queries in Python in Excel here.
Finally, here’s the modified version of the Python code to create the scatterplot shown above:
df=xl("SalesByCounty") s = plt.scatter(df.CountOfSales, df.AveragePricePaid, marker=11) s.axes.xaxis.set_label_text("Count Of Sales") s.axes.yaxis.set_label_text("Average Price Paid")
Note how, in the first line, I can reference the Power Query query by name in the xl function and how, in the second line, renaming the columns in Power Query makes writing the Python code much easier.
Is this actually going to be useful to anyone? Well if Python in Excel is going to be used, it will be used by data analysts who love both Excel and Python – and who are also likely to use Power BI too. As Sandeep argues in the blog post about Semantic Link referenced above there are several reasons why these data analysts should use a Power BI dataset as a data source for their work rather than going back to the raw data: for example they can be sure they are using exactly the same data that is being used in their reports and they can use measures defined in the dataset rather than have to recreate the same calculations in their own code. While Semantic Link in Fabric is much more powerful than anything you can do in Excel with Power BI data, it’s only available in Fabric notebooks and this needs a Fabric or Premium capacity; this technique is available to anyone who has Python in Excel and works with Power BI Pro as well as Premium. So yes, in conclusion, I think there are some practical uses for this.