In my last post I showed how to create a Power Automate custom connector for the new endpoints in the Power BI REST API for exporting a report to a file. In this post I’ll show you how to use this custom connector in a flow in Power Automate.
The three endpoints in the Power BI API that need to be called to export a report (and which have been included in the custom connector from my last post) are as follows:
Export To File In Group must be called first, to start the process of exporting either a Power BI report or a paginated report to a file. Exporting may take some time, though, so the exported file doesn’t get returned at this point. In the headers of the response there is a retry-after value in seconds telling you how long you must wait initially, before…
…calling Get Export To File Status In Group to find out whether the export is ready or not. If it isn’t ready you again need to wait the number of seconds specified in the retry-after header before checking the status again. If it is ready, you can then…
This is all described in the C# example code here, so the challenge is to translate this into a flow. In my case, I want to export the following paginated report to a CSV file:
Here’s what my flow looks like at the top level:
To make things easy to understand I’m using a manual trigger to start the flow and hard-coding the report I want to export. The second Action calls Export To File In Group:
You will need to go into the Settings for this Action and set the Asynchronous Pattern option to Off:
As I said, in the response from this first API call the retry-after header tells you how many seconds you should wait before checking the export status for the first time. This value is used in the next Action, which adds the delay:
Once this delay has passed there’s a Do loop that will call the Get Export To File Status In Group endpoint up to ten times to check if the export is ready:
This call uses the export id found in the response from the call to Export To File In Group:
If this call returns that the status is Running or NotStarted, then it will again wait for the number of seconds specified in the retry-after header; if not then there is no delay:
The expression on the Do loop breaks the loop if the export status is Succeeded or Failed:
After the loop, there is a check to see if the export status is Succeeded:
If the export status is Succeeded, then the file is downloaded and saved to OneDrive for Business:
Otherwise, the flow sends a failure notification email:
And here’s the exported CSV file viewed in Excel Online:
And that’s it. I know this flow isn’t as robust as it could be, but I hope it gives you an idea of how to use the custom connector to export Power BI reports in Power Automate; as I said last time, I have very little Power Automate experience so please forgive any newbie mistakes! If you can see a way to improve on what I’ve done here, please leave a comment.
Automated exports from a Power BI paginated report to a CSV file is already very useful but there are other fun things we can do with this – and in my next post I’ll show you another, less obvious example.
Recently, a new set of endpoints in the Power BI REST API for exporting Power BI reports and paginated reports to files went into public preview (see the main announcement here and the paginated reports announcement here). Since there are all kinds of cool things you can do with this I thought I would write a few posts on how to use these endpoints in Power Automate. In this post I’ll talk about setting up a custom connector in Power Automate; in the next post I’ll talk about how to use this custom connector in Power Automate; and after that I’ll show you some less obvious uses for all this.
Before we go any further, and before you get too excited, you should read the limitations of the public preview listed here, especially those around the number of report pages that can be exported per hour and the number of pages in a report that can be exported. Also, this functionality is only available with Power BI Premium or Power BI Embedded.
Creating a Power Automate custom connector for the Power BI REST API is something that several people have blogged about already in detail. Konstantinos Ioannou has a very detailed walkthrough here; Jese Navaranjan has a video walkthrough here; and I blogged about how you could use the Swagger definition of the Power BI REST API to create a custom connector here. I’m not going to go over all these steps again but there are a few specifics that need pointing out if you want to build your custom connector manually. If you’re lazy, I exported my custom connector to a Swagger file and you can download it here; you should be able to import it and create your own custom connector very easily. I don’t pretend to be a Power Automate expert so please excuse any newbie mistakes!
The three endpoints that you’ll need to use to export to a file (unless your report is in your My Workspace, in which case there are three other equivalent endpoints) are Export To File In Group, Get Export To File Status In Group and Get File Of Export To File In Group. There’s good documentation for regular Power BI reports here and paginated reports here, but in summary Export To File In Group starts the export process but because this might take a long time, doesn’t return the exported file; Get Export To File Status In Group allows you to check the status of the export; and Get File Of Export To File In Group returns the file once the export is ready.
Creating a custom connector in Power Automate is a four-step process and steps 1, 2 and 4 (“General”, “Security” and “Test”) are well covered in the guides above. Step 3 (“Definition”) is where you need to create three Actions for the three endpoints above.
After filling in the General and Security pages, go to the Definition page and click the New action button to create an Action for the Export To File In Group endpoint. You first need to fill in the information in the General section:
Then in the Request section click the Import from sample button, select POST and paste
…into the URL box and the sample payload here into the Body box:
Setting up the other two endpoints is similar except that you need to select GET instead of POST and you don’t need to paste anything into the Body box. Here are the two templatised URLs to use for them:
There’s more to do on this page, but at this point you should go to the Test page and test these three new Actions. For Export To File In Group you need to at least enter a groupId, a reportId and a format:
The groupId and reportId can be found by navigating to your report in the Power BI portal; you can extract the groupId and reportId from the URL like so:
The valid values for format (and the other parameter values) are in the docs.
Assuming that it all works, you will be able to scroll down and see the response. Copy all the JSON in the Body section and save it somewhere temporarily.
Do the same thing for the other two Actions you have created; these two actions take a third parameter called exportId, which is the id value in the response that I’ve highlighted in the screenshot immediately above.
Once you have the Body text from Export To File In Group and Export To File Status In Group (the response from Get File Of Export To File In Group should be your report export), go back to the Definitions page and for these two Actions scroll down to the Response section and click on Add default response.
There are two things to do here for both Actions. First, paste the responses you got on the Test pane into Body; next, paste:
…into the Headers section and then click Import. This will make the output of the Actions much easier to consume in Power Automate later on.
Your custom connector is now ready, and in part 2 of this series I’ll show you how to use it in Power Automate.
[Thanks to Jaime Tarquino and Chris Finlan for their help getting this working]
Now that Power BI is so popular, it’s becoming more and more common for organisations to migrate their reports from other BI platforms to Power BI. And why not? You’ll be moving to a modern, industry-leading, cloud-native BI platform and probably saving a lot of money in licensing costs too. As a Microsoft employee I wholeheartedly encourage this (obviously!) but it’s not without its pitfalls. In this post I’m going to highlight some common mistakes I’ve seen people make when migrating to Power BI so that you can avoid making them yourself. It isn’t always a simple lift-and-shift job…
#1 You may need to remodel your source data
Every report has a data source and getting source data in the right format for your BI platform is a substantial task – so much so, that you might be tempted to put Power BI on top of the data sources you have created for your previous BI platform with no changes. However different BI platforms need their data in different formats. Many BI platforms like their data munged together in one big table, sometimes even with data at different granularities in the same table. Power BI, on the other hand, likes its source data modelled as a star schema (you can find out what a star schema is and why it’s important here). If you don’t model your data as a star schema you may find that you see incorrect values in your reports, that report performance is poor, and that it’s a lot harder to write the DAX calculations that you need.
#2 Understand how Power BI works with data
Even if you have a star schema, you also have to understand that how Power BI works with data may be different to the way your previous BI tool worked. As Marco Russo and Alberto explain here, Power BI wants to you invest time up-front to create a semantic layer and once you have done this you’ll find that building reports and calculations is much quicker and easier. If you’re wondering why you can’t just write a SQL query to get the data for that chart you need to build, you’ve made this mistake.
#3 Don’t try to recreate the exact functionality of your old BI platform
If you’re building a Power BI report to replace an existing report on a legacy platform, and you ask your users what they want the report to look like, the most common reply is “Just like the old one”. This is a danger sign! One surefire way to make a Power BI report hard to build and slow to run is to try to make it do something it wasn’t designed to do.
One example of this is the way some users ask for reports that recreate the gigantic tables they are used to seeing in their old Excel reports. These tables may be slow to render in Power BI (especially if they have hundreds or thousands of rows) and what’s more, they don’t make the data easy to understand – better visualisation options are always available. My colleagues Adam and Patrick discussed this problem in detail in a recent video that’s well worth watching.
Another, more specific example, is the requirement to allow end users to change the measures or fields that are shown in a visual in a Power BI report. This is a feature we have on our public roadmap but I’ve seen many customers create tie themselves in knots trying to implement this using existing functionality; it is possible, but just because something is possible doesn’t mean you should do it. Once again, needless complexity, slower development and performance problems are the result.
Instead what you should be doing is understanding the business problem the report is trying to solve and implementing a solution using the functionality Power BI has built in. I know that users can be very stubborn about things like this, but trust me, it’s better to have the battle now rather than deal with the consequences of doing what they want you to do.
#4 Don’t forget about Analyze in Excel and Paginated Reports
Following on from the last point, if your users want to be able to explore their data by changing the measures and fields used in a visual they are probably thinking of how they use PivotTables and PivotCharts in Excel. And if that’s what they want, why don’t you let them use the real thing? Power BI’s Analyze in Excel feature doesn’t give you something that looks like an Excel PivotTable, it gives you an actual PivotTable connected to data stored in Power BI. It’s the functionality your users are comfortable with none of the downsides of traditional Excel reports such as the tedious, error-prone, manual data refreshes. What’s more you can also use Excel cube functions for more complex report layouts, such as those needed by financial reports. This video by Peter Myers is a great introduction to cube functions – they work with Power BI datasets in the same way they work with Analysis Services cubes or Power Pivot.
Similarly if, after everything I’ve just said, you still want to create reports sourced from hand-written SQL queries with gigantic tables that can be printed easily, you’ll find that Power BI Paginated Reports work much better than regular Power BI reports. Closely related to SQL Server Reporting Services, one of the most popular BI platforms in history, paginated reports have a different set of strengths compared to regular Power BI reports. Use the right tool for the job!
#5 This is your chance to change who does what
Migrating to Power BI is a chance to make a break from the old ways of doing things, and that includes who does all the work. If you’ve worked in BI for any length of time you’ll have seen plenty of examples of these two extreme approaches to BI development and the pain that goes with them:
Corporate BI, where all the work is done by the IT department – which quickly becomes the bottleneck, unresponsive to business needs and unable to understand them properly.
Excel Hell, where the business builds everything itself – and ends up with a lot of duplicated effort, multiple versions of the truth, and reports that break when the person who built them leaves for a new job.
Power BI lets you chart a course between these two. It’s easy enough to use, and affordable enough, for you to deploy it to a larger number of users in your organisation than any other BI tool except Excel. At the same time it gives you the tools you need to avoid the problems of Excel Hell: centralised data, automated refresh, security, monitoring and a lot more. This, for me, is the key to all successful Power BI deployments: empowered users and the IT department working together as a team, each doing what they do best.
Power BI incremental refresh is a very powerful feature and now it’s available in Shared capacity (not just Premium) everyone can use it. It’s designed for scenarios where you have a data warehouse running on a relational database but with a little thought you can make it do all kinds of other interesting things; Miguel Escobar’s recent blog post on how to use incremental refresh for files in a folder is a great example of this. In this post I’m going to show you how to use incremental refresh to solve another very common problem – namely how to get Power BI to keep the data that’s already in your dataset and add new data to it.
I know what you’re thinking at this point: isn’t this what incremental refresh is meant to do anyway? Well, yes it is, but as I said it’s designed to work in scenarios where a relational data warehouse stores a copy of all the data that’s in your dataset and in some cases you don’t have that luxury. For example, you may want to:
Connect to a data source that only gives you a set of new sales transactions each day, and add these sales transactions to the ones you have already stored in a Power BI dataset
Take a snapshot of a data source, like an Excel workbook, that is changing all the time and store each of these daily snapshots of the contents of the workbook in a Power BI dataset
The current Power BI incremental refresh functionality doesn’t make it easy to do either of these things, and that’s why I’ve written this post.
Let me be clear though: in all these cases I recommend that you don’t use the technique I’m going to show you. If possible, you should stage a copy of each day’s data in a relational database (ie build that data warehouse) or even as text files in a folder (Power Automate may be useful to do this) and use that staged copy as the data source for Power BI. This will allow you to do a full refresh of the data in your dataset at any point in the future if you need to, or create a completely new dataset, even though it means you have to do a lot of extra work. If it isn’t possible to do this, or you’re too lazy or you’re just curious to see how my technique works, read on.
For the example I’m going to show in this post I’m going to use a web-based data source, an RSS feed from the BBC News website that returns a list of the current top stores on the site. You can find it here:
RSS is based on XML and there’s no authentication needed to access this feed, so loading a table of these top stories into Power BI is very easy – so easy, I’m not going to bother explaining how to do it. Instead I’m going to show you how to use incremental refresh to store a copy of this list of top stories every day in the same table in a dataset.
First of all, for incremental refresh to work two Power Query parameters of data type date time need to be created called RangeStart and RangeEnd. More details about how to create them can be found in the docs here. Power BI expects them to be used to filter the rows loaded into a table; it also uses them to partition these tables in the dataset.
Next you need a Power Query query to load the news stories. Here’s the M code for the query:
As I said, the first couple of steps aren’t interesting – they just connect to the RSS feed to get the list of top stories as a table. After that:
The CurrentDateTime step gets the current date and time at the point when the dataset refreshes. It’s important to note that I’ve used the DateTimeZone.FixedUtcNow function – this not only gives you the current UTC date and time, but it’s guaranteed to give you the same date and time every time you call it within a query. If you use DateTimeZone.UtcNow you may get a different date time returned every time the function is called within the query, which can make things very confusing.
The PreviousDay step gets yesterday’s date by extracting the date from the value found in the CurrentDateTime step and subtracting one day.
The #”Added Custom” step adds a new column to the table called “UTC Data Load Date” containing the value returned by the CurrentDateTime step. As the name suggests, this column shows when the data in any particular row was loaded.
The #”Filtered Rows” step is extremely important. It filters the rows in the table: it will return all the rows in the table if the RangeStart parameter returns yesterday’s date, otherwise it will return no rows at all. It also has an AND condition that checks whether the RangeEnd parameter is not null, which should always be true. Without this step that uses RangeStart and RangeEnd in some way you won’t be able to turn on incremental refresh; the significance of only returning data for yesterday’s date will become clear later.
At this point, unless you happen to have set the value of the RangeStart parameter to yesterday’s date, the table returned by the query will be empty.
Once you have loaded this table to your Power BI dataset (I called it “News”) you need to configure incremental refresh on it. You can do this by right-clicking on the table in the Fields pane in the main Power BI window and selecting Incremental refresh from the menu. Here’s how you will need to configure the settings:
You can ignore the warning about query folding and the M query at the top. What you need to do here is:
Set the “Incremental refresh” slider to On
Under “Store rows in the last” choose Days from the dropdown and enter the number of days you want to store data for. You probably don’t need to store data indefinitely; if you did, your dataset might get very large. Data that is older than the number of days specified here will be deleted from the dataset.
Under “Refresh rows in the last” again select Days from the dropdown and enter 1.
Do not check the boxes for “Detect data changes” and “Only refresh complete day”.
With that done all you need to do is publish to the Power BI Service and set up scheduled refresh for once a day (and no more). When the report refreshes each day, all the stories listed in the RSS feed will be added to the existing stories in the dataset. To monitor this I created a measure called Story Count to count the number of rows in the News table, and then created a simple report that showed the total value of this measure and also showed it broken down by the UTC Data Load Date column. Here’s what it looked like yesterday (the 12th of April 2020), after it had already been refreshed for a few days:
and here’s what it looked like today (the 13th of April 2020), with 51 more stories having been loaded in this morning:
The last thing to point out is that connecting to the XMLA endpoint (or should I say connecting via the “Analysis Services protocol”…?) for the workspace using SQL Server Management Studio, if it’s in Premium, makes it a lot easier to understand what’s happening behind the scenes here. If you right-click on your table in SQL Server Management Studio’s Object Explorer pane and select Partitions:
…You can see the names of the 51 partitions created to hold the data (remember, above we opted to store 50 days of historical data), the number of rows of data in each partition and the dates that these partitions were last refreshed (or “processed” in Analysis Services terminology):
Notice that the partition for today’s date, April 13th, is empty and that both the partition for today and yesterday (April 12th) have been refreshed today; similarly, the partition for April 11th was last refreshed yesterday on April 12th. The two most recent partitions are always refreshed and this is is why the #”Filtered Rows” step in the M code above only returns rows when the RangeStart parameter holds yesterday’s date, to make sure that the stories for the current day are only stored once.
You can also script out the table to TMSL like so:
This allows you to see the definitions of each partition and, crucially, the start and end values that are passed to the RangeStart and RangeEnd parameters when they are refreshed:
You can download my example pbix file for this post here.
One of the most common errors you’ll see when working with Power Query in Power BI or Excel is this:
Expression.Error: The key didn’t match any rows in the table
It can occur with almost any data source and in a wide variety of different circumstances, and for new users of Power Query it can be very confusing. In this post I’ll explain what the error message means and when you’re likely to encounter it using a simple example.
TL;DR You’re probably getting this error because your Power Query query is trying to connect to a table or worksheet or something in your data source that has been deleted or been renamed.
Say you have an Excel workbook with three worksheets in called Sheet1, Sheet2 and Sheet3:
On Sheet1 there’s some data you want to load into Power BI or Excel using Power Query:
It’s very easy to do this: after you click on the Excel data source and select your Excel workbook, you see the Navigator window with the contents of the workbook and select Sheet1:
…click OK and your query is ready:
What actually happens in this query though? If you click on each of the four steps in the Applied Steps pane on the right-hand side of the screen, you’ll see what each step does. In particular, notice that the first step in the query (called Source) returns a table with one row for each worksheet, table and named range in the workbook:
This is something that happens with many types of data source: after you have connected you have to choose what data inside the data source you want to connect to in the first step, and the list of things you can connect to is returned in a table. This table lists the same things displayed in the Navigator dialog shown above. In the same way, if you connect to a SQL Server database this first step returns a table with all the tables and views in the database; if you connect to an OData feed the first step shows all the resources available through the feed.
The other thing to notice in the table in the screenshot is the Data column, which contains nested table values. If you were to click on one of these nested tables you would navigate to the data in the worksheet named on that row and that is in fact what the second step in the query does.
If you want to understand how the code in the second step of this query gets the contents of the cell containing the nested table, I have a blog post here that goes into a lot more detail. Basically the second step (called Sheet1_Sheet in the code, but shown as Navigation in the Applied Steps pane) returns the nested table in the Data column from the row in the table where the Item column contains the value “Sheet1” and the Kind column contains the value “Sheet”. It’s the contents of the Item and Kind columns that are used to identify the row in the table that contains the data you want to see – in database terminology these columns are the key columns on the table.
What happens if you change the name of the worksheet that contains the data you want, from “Sheet1” to “Hello”?
The result is that the first step of the query now returns a table that looks like this:
Since there isn’t a row where the Item column contains “Sheet1” any more, the second step can no longer find the row it’s looking for – the key value it’s looking for no longer exists in the table – which is why you see the error message you do:
There are two ways to fix the error:
Go to the Excel workbook and change the name of the sheet called “Hello” back to “Sheet1” or
In either the Advanced Editor or the Formula Bar edit the M code and replace the reference to “Sheet1” with “Hello” in the second step of the query:
In summary, when you’re working with data sources in Power Query you need to be aware that if something in your data source is renamed or deleted then your query will break. You can of course write some clever M code to deal with situations like this but my recommendation is to try to fix the problem in the data source and not in your query.