Monitor Power BI Queries And Refreshes With DirectQuery On Log Analytics, Part 4: Query Sessions

In my last post I showed how to use Log Analytics data to analyse Power BI query activity. The problem with looking at a long list of queries, though, is that it can be overwhelming and it can be hard to get a sense of when users were and weren’t actively interacting with a report. In this post I’ll show you how you can write a KQL query that gives you a summary view that solves this problem by grouping queries into sessions.

What is a session? I’m going to define it as a group of DAX queries – sessions – run by the same user that occur within 90 seconds of each other. If a user opens a Power BI report, clicks on a slicer or filter, changes a page or whatever, then so long as each of the DAX queries that are generated in the background when they do this have end times that are no more than 90 seconds apart, then that will count as a single session.

The key to solving this problem is the KQL scan operator. If, like me, you’re the kind of geek that likes data analysis you’re going to love the scan operator! There’s a great blog post explaining what it does here; in summary it allows you to do process mining, ie find sequences of events that match a given pattern. I can think of a few cool things I could do with it but grouping DAX queries into sessions is fairly straightforward. Here’s my KQL query:

| where TimeGenerated > ago(4h)
| where isempty(ExecutingUser)==false 
and ExecutingUser<>"Power BI Service"
and OperationName=="QueryEnd"
| project OperationName, OperationDetailName, TimeGenerated, 
EventText, DurationMs, ExecutingUser
| partition by ExecutingUser
sort by TimeGenerated asc
| scan declare (SessionStartTime:datetime) with
    step x: true => 
        SessionStartTime = 
| summarize 
    by ExecutingUser, SessionStartTime
| extend SessionLength = SessionEndTime - SessionStartTime
| sort by SessionStartTime asc

This is how the query works:

  • Filters the Log Analytics data down to the queries run in the last four hours which were run by a real user and not the Power BI Service
  • Splits this data into separate tables (using the KQL partition operator) for each user
  • Sorts these tables by the TimeGenerated column
  • For each of these tables, add a new column called SessionStartTime that contains the value from the TimeGenerated column and copies it down for each subsequent query if its start time is less than 90 seconds; if a query starts more than 90 seconds after the previous one then SessionStartTime contains that query’s TimeGenerated value again
  • Uses the summarize operator to group the data by the values in SessionStartTime and ExecutingUser

Here’s the output:

This query returns one row per user session where:

  • SessionStartTime is the end time of the first DAX query run in the session
  • SessionEnd is the end time of the last DAX query run in the session
  • QueryCount is the number of DAX queries run in the session
  • SessionLength is the duration of the session

Let me know if you have any ideas for modifying or improving this…

Seventeenth Blog Birthday

Every year, on the anniversary of the first-ever post on this blog, I write a post reflecting on what has happened to me professionally in the past year. In the past this has meant I’ve written about learning some new technology or language (yes, DAX and M were new once), dealing with a business issue back when I had my own company, or more recently adjusting to life as a full-time employee at Microsoft. This year’s new challenge has been becoming a manager for the first time.

One of the great things about working for Microsoft is that, unlike many other companies, you don’t need to become a manager to progress in your career. However, the Power BI CAT team is growing and when I was given the opportunity to manage a part of it I thought, why not? I’m in my late 40s and have never had anyone reporting to me before – in my defence I spent many years working for myself – so I thought it would be good to get out of my comfort zone and try something new. It helped that I have a very supportive manager and that several other people on the team made the transition to management at the same time, which means we’re all learning together. I have particularly enjoyed recruiting new people for my team from inside and outside Microsoft: I have some extremely talented people on my team already, with several more due to start in early 2022.

This doesn’t mean I have left technical things behind though, just that I’m spending less time doing technical things and more time managing other people who do technical things. I think it’s important to stay as technical as I can and to maintain some direct contact with customers in order for me to be an effective manager; again, one of the things I like about Microsoft and my team in particular is that I haven’t had to make a binary choice between being a manager or being technical. That said I have had to accept that, more than before, there are problems I can’t help solve and shiny new things I don’t have time to learn about, and that has been hard.

None of this will affect this blog’s focus on Power BI (I’m certain no-one is interested in my thoughts about management…) or how often I blog but it will accelerate a trend that I suspect has been apparent for the last year or so. The primary motivation for me to blog has always been my own education: writing down information I can’t find anywhere else means it doesn’t get lost and explaining it to other people helps me understand it myself. Now, though, this is pretty much the only reason for me to blog, which means even more of the obscure factoids about Power Query data privacy settings and even less of the click-friendly top ten lists about Power BI/Excel integration type of content. I don’t think many people came here for the introductory tutorials though, did they?

Video: Power BI Data Privacy Settings Deep Dive

Over the past year or so I’ve been delivering a presentation on the Power Query engine’s data privacy settings at various conferences and Power BI user groups, in an attempt to try to pull together all the knowledge I have on this complex topic. Luckily, when I presented this session at the London Power BI User Group recently, they recorded it and posted it on YouTube here:

If you’re struggling with data privacy errors like:

Formula.Firewall: Query ‘Query1’ (step ‘xyz’) is accessing data sources that have privacy levels which cannot be used together. Please rebuild this data combination.


Formula.Firewall: Query ‘Query1’ (step ‘xyz’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

…in either the Power Query Editor in Power BI or Power Query/Get&Transform in Excel, then I hope this video will help you understand why you’re getting these errors and what you can do to avoid them.

A lot of what I show in this session draws on other material, such as:

  • My five-part series of posts on the Power Query data privacy settings that starts here
  • My post on how credentials and data privacy settings are stored for dynamic data sources here
  • My post here on the performance overhead of applying data privacy checks
  • Ehren von Lehe’s detailed paper on (available here) on how the engine partitions data sources while applying data privacy checks. One of the things I show in the video is that it’s now possible to see these partitions using Power Query Query Diagnostics (see here for some details – although I should probably devote a whole blog post to this in the future)

Lastly, one minor correction to something I said in the video: at the 44:32 mark I do a demo that shows how you can use M functions to avoid a Formula Firewall error. While this trick works in Power BI Desktop and Excel, it results in a dataset that can’t be refreshed in the Power BI Service unfortunately.


Limit The Amount Of Data You Work With In Power BI Desktop Using Parameters And Deployment Pipelines

If you’re working with large amounts of data in Power BI you may find that you have problems because:

  • Your pbix file is very large
  • You spend a long time waiting for refreshes to finish in Power BI Desktop – and if you’re developing, you may need to refresh your dataset frequently
  • It takes a long time to publish your dataset to the Power BI Service

Wouldn’t it be great if there was a way to work with a small subset of your data in Power BI Desktop and then, after you publish, load all the data when you refresh? The good news is that this is now possible with the new deployment pipelines feature in Power BI!

Assuming that you know the basics of how deployment pipelines work (the documentation is very detailed), here’s a simple example of how to do this. Let’s say that you want to use data from the FactInternetSales table in the Adventure Works DW 2017 SQL Server sample database in your dataset. When you import the data from this table and open the Advanced Editor to look at the M code for the query, here’s what you’ll see:

  Source = Sql.Databases("MyServerName"),
  AdventureWorksDW2017 = Source{[Name = "AdventureWorksDW2017"]}[Data],
  dbo_FactInternetSales = 
   AdventureWorksDW2017{[Schema = "dbo", Item = "FactInternetSales"]}[Data]

This query, of course, imports all the data from this table. To cut it down to a smaller size, the first thing to do is to create a new Power Query parameter (called FilterRows here) of data type Decimal Number:


Notice that the Current Value property is set to 5. The purpose of this parameter is to control the number of rows from FactInternetSales that are loaded into the dataset. Here’s an updated version of the Power Query query above that uses this parameter:

  Source = Sql.Databases("MyServerName"),
  AdventureWorksDW2017 = Source{[Name = "AdventureWorksDW2017"]}[Data],
  dbo_FactInternetSales = 
  AdventureWorksDW2017{[Schema = "dbo", Item = "FactInternetSales"]}[Data],
  FilterLogic = 
     FilterRows <= 0 

A new step called FilterLogic has been added at the end of this query that implements the following logic:

  • If the FilterRows parameter is less than or equal to 0 then return all the rows in the FactInternetSales table, otherwise
  • If FilterRows is more than 0 then return that number of rows from the table

Given that the FilterRows parameter is set to 5, this means that the query now returns only the top 5 rows from FactInternetSales:


It’s important to point out that a filter like this will only make your refreshes faster if the Power Query engine is able to apply the filter without reading all the data in the table itself. In this case it can: with a SQL Server data source query folding ensures that the SQL query generated for the refresh only returns the top 5 rows from the FactInternetSales table:


Here’s a simple report with a card that shows the number of rows loaded into the table:


At this point you have your cut-down dataset for development in Power BI Desktop.

Next, publish this dataset and report to a workspace that is assigned to the Development slot in a deployment pipeline and then deploy them to the Test workspace:


Then click the button highlighted in the screenshot above to create a new dataset rule that changes the value of the FilterRows parameter to 0 when the dataset is deployed to the Test workspace:


With this rule in place, when the dataset in the Test workspace is refreshed, the logic in the query above now ensures that all the data from the FactInternetSales table is loaded into the dataset. Instead of just 5 rows, the report now shows that the full 60000 rows of data have been loaded:



Monitoring Power Query Memory Usage With Query Diagnostics In Power BI

In the April release of Power BI Desktop the Power Query Query Diagnostics feature was enhanced so that you can now return performance counter data. As the blog post says:

When you run performance counters, every half second Power Query will take a snapshot of resource utilization. This isn’t useful for very fast queries but can be helpful for queries that use up a lot more resources.

When might this be useful in the real world? In my last blog post I had a chart that shows the amount of data that Power Query reads from disk while loading a large-ish JSON file, created from data collected in Process Monitor using this technique. Here it is again:


The x axis is relative time in seconds from when Power Query started reading the data; the y axis shows the amount of data read. Notice how data is read at a constant rate for the first 1.5 seconds, but that after the 1.5 second mark the throughput flattens? What could be causing this?

Although this does not happen consistently, and I didn’t collect the necessary data when I ran this particular test, the answer is likely to be related to how the Power Query engine uses memory.

This is a topic I have blogged about before and I strongly recommend that before carrying on you read this post on the Container Size property that can be set on dataflows in Power BI Premium. Here’s a quote from Curt Hagenlocher of the Power Query dev team from that post that is relevant here too, about how the Power Query engine uses memory when a query executes:

Certain operations force the rows of a table value to be enumerated. If the enumeration operation itself is expensive, then using Table.Buffer can be a performance optimization because we store the values in memory so that second and subsequent enumerations of the rows go against memory.

If the table is only being enumerated once (which is the most common scenario) or if the underlying enumeration is fast anyway, then Table.Buffer won’t help performance.

Table.Buffer can actually hurt performance in some cases, because we cap RAM usage of the query at 256 MB — which means that a query which uses more than 256 MB is now forced to page RAM to/from disk. Enough paging, and the performance cost can be quite dramatic.

Currently, “table at a time” operations like joins, sort, many groupings, pivot, unpivot, etc., all happen in RAM (unless folded). For large tables, these will consume a lot of memory.

Now it turns out that when Power Query reads a JSON file it has to load the whole file into memory – something that isn’t true for other types of data source. So maybe, while reading the JSON file used in my test, the slow-down in throughput was caused by paging?

With the option to collect performance counters turned on (see the announcement blog post for how to do this) I opened up the pbix file with the query from my last blog post in, went to the Power Query Editor, and collected some query diagnostics data by right-clicking on the last step in my query and selecting Diagnose:

Diagnose Step

It soon became clear that the Power Query engine has a pool of mashup containers that it reuses (yes, go and read that blog post I told you to read!) and it seems like they often need to do some garbage collection before a query runs. So, to get a nice-looking graph, I did something highly unsupported that still seemed to work: I opened Task Manager and killed all the Microsoft Mashup Evaluation Container processes I could see under Power BI. Having done this, when I collected my performance counter data I could build the following graph showing Power Query memory usage while the query was evaluating:


The x axis shows the number of seconds elapsed since the start of the query; the y axis shows the value in bytes for the Commit and Working Set performance counters. The yellow line, for the Commit (bytes) performance counter, shows the amount of virtual memory used by Power Query. The blue line, for the Working Set (bytes) performance counter, shows the amount of physical memory used by Power Query; as you can see it reaches 256MB (indicated by the red dotted line) halfway through and never exceeds that. While Commit is greater than Working Set paging must be happening and Power Query performance may suffer as a result.

Over the years I have found that one of the most effective ways of improving Power Query performance is to try to avoid transformations like those that Curt mentions above that require a large tables to be held in memory – although that’s easier said than done. At least now we have an easy way of seeing where memory might be causing problems for Power Query performance using this new feature.

Speed Up Data Refresh Performance In Power BI Desktop Using Table.View

It can sometimes be frustrating to work with slow data sources or complex Power Query queries in Power BI Desktop: you open the Power Query Editor, make some changes, click Close & Apply and then wait a loooong time for your data to refresh. In this post I’m going to show you a technique that can cut this wait by up to 50%. It involves some fairly complex M code but I promise you, the effort is worth it!

In fact, what I’m going to describe is more or less what I showed towards the end of my appearance on Guy In A Cube last year and in a few other posts, but at that time I didn’t understand properly why it worked or what the performance implications actually were. Now, thanks to a lot of help from Curt Hagenlocher of the Power Query development team I have all the details I need to blog about it.

Let’s see a simple example. Say you have a large JSON file – for this test I generated one that is 67MB containing random data using this handy online tool – and you want to load it into Power BI. You’ll end up with a query that looks something like this:

  Source = Json.Document(File.Contents("C:\generated.json")),
  #"Converted to Table" = Table.FromList(
  #"Expanded Column1" = Table.ExpandRecordColumn(
    #"Converted to Table", 
    {"_id", "index", "guid", "isActive", "balance"}, 
    {"_id", "index", "guid", "isActive", "balance"}
  #"Changed Type" = Table.TransformColumnTypes(
    #"Expanded Column1", 
      {"_id", type text}, 
      {"index", Int64.Type}, 
      {"guid", type text}, 
      {"isActive", type text}, 
      {"balance", type text}
  #"Changed Type"

Here’s what the output of this query looks like:


On my laptop this query takes around 7-8 seconds to load. Using Process Monitor to see how much data is being read from the file (see this post for more details) shows that Power Query is reading the data from this file twice. This graph, generated using data from Process Monitor, has time on the x axis and amount of data read from the file on the y axis, and the two peaks indicate that the data is being read twice:


Why? When you refresh a table in Power BI Desktop two things happen:

  • First, Power BI has to check what columns are present in the table. To do this, it runs the query for the table but asks the Power Query engine to filter the table so it returns zero rows.
  • Secondly the query is run again but this time with no filter, so all rows are returned, and this is when the data is actually loaded into the table in Power BI.

It’s on this first run of the query where problems can occur. If you’re using a data source like SQL Server and you’re not doing any complex transformations then query folding will take place, so Power BI’s request to return the table but with no rows can be handled very efficiently. However, if you’re using a data source where query folding is not possible (such as Excel, CSV or JSON files) or you have complex transformations that stop folding taking place then the only way Power Query can work out what columns the query returns is by running the entire query. That’s what is happening in the example above.

This only happens in Power BI Desktop, though. When you refresh a dataset in the Power BI Service the query is only run once.

The solution is to trick the Power Query engine into running these zero-row filter queries instantly, and you can do that using the Table.View M function. I blogged about Table.View here (and I strongly suggest you read that post before carrying on) but that example didn’t quite get to the solution you need here. Here’s a new version of the query above with an extra step:

  Source = Json.Document(File.Contents("C:\generated.json")),
  #"Converted to Table" = Table.FromList(
  #"Expanded Column1" = Table.ExpandRecordColumn(
    #"Converted to Table", 
    {"_id", "index", "guid", "isActive", "balance"}, 
    {"_id", "index", "guid", "isActive", "balance"}
  #"Changed Type" = Table.TransformColumnTypes(
    #"Expanded Column1", 
      {"_id", type text}, 
      {"index", Int64.Type}, 
      {"guid", type text}, 
      {"isActive", type text}, 
      {"balance", type text}
  OverrideZeroRowFilter = Table.View(
    GetType = () => 
      type table[
      _id = Text.Type, 
      index = Int64.Type, 
      guid = Text.Type, 
      isAction = Text.Type, 
      balance = Text.Type
    GetRows = () => 
      #"Changed Type", 
    OnTake = (count as number) => 
      if count = 0 then 
      type table[
        _id = Text.Type, 
        index = Int64.Type, 
        guid = Text.Type, 
        isAction = Text.Type, 
        balance = Text.Type
     Table.FirstN(#"Changed Type", count)]

The OverrideZeroRowFilter step is where the magic happens. It uses Table.View to override query folding behaviour by intercepting what happens when the table returned by the query is filtered. There are three fields in the record in the second parameter of Table.View that you need to change:

  • GetType returns a table type that describes the columns and their data types present in the output of the query. The six columns listed here are the six columns you can see in the screenshot of the query output above. It’s very easy to generate the required M code when you use the custom function that I blogged about here.
  • GetRows returns all the rows that the query can return, that’s to say the table returned by the #”Changed Type” step.
  • OnTake is used when a top n filter is applied to the table returned by the query.  In this case the code looks at the number of rows that are being requested (specified in the count parameter) and if that’s zero, it just returns an empty table with the same columns as the #”Changed Type” step; if it’s more than zero then it uses Table.FirstN to get the actual number of rows requested from #”Changed Type”. This means that when Power BI does that zero-row filter it can now happen immediately because there’s no need to go back to the data source or execute any of the transformations in the query.

This version of the query now runs in 4-5 seconds, and Process Monitor now shows that the JSON file is only read once and, obviously, reading the data once is a lot faster than reading it twice:


Look at how the graph flattens at the end… but that’s something for another blog post.

There is a downside to this approach: you have to hard-code the table schema that you expect your query to return, and if you change your query to return different columns you’ll have to update the table type in the last step.

I’ve used this technique on other slow queries and it has made a massive difference to the development experience in Power BI Desktop – one query that was taking five minutes to refresh when I closed the Power Query Editor went down to two and a half minutes. If you try this out yourself please let me know if it makes a difference by leaving a comment.

[Don’t forget that there are other things you can do that will also speed up the development experience for slow queries: you should definitely turn off the “Allow data preview to download in the background” option, and you might want to consider turning off data privacy checks so long as you fully understand what the implications]


Power Query Formatter

If you’ve ever used DAX Formatter to format your DAX code and wondered why there isn’t an equivalent for Power Query/M, then wonder no more: the nice people at Power Pivot Insights in Germany (read their German-language blog here) have built one. You can find it here:

PQ Formatter

They also have an API you can use too.

Bonus information: did you know that Microsoft also has an open source parser for M, available here?

Calling The Power BI Export API From Power Automate, Part 3: Creating An Alerting Solution

In my last two posts (see here and here) I showed you how you can call the new endpoints in the Power BI REST API for exporting a report. There are some obvious, extremely useful applications for this such as emailing PDF exports of a report out to large groups of users. In this post, however, I’ll show you how you can use this for something different: creating a Power BI alerting solution.

Now I know what you’re thinking: we already have alerts in Power BI and we can also trigger a Power Automate flow when an alert fires. This functionality is a bit limited though: you can only create an alert on a dashboard, not a report, and then only on some visuals; what’s more you can only use fairly simple rules to trigger an alert. You may need something more sophisticated, but while Power Automate would seem to be the perfect tool for building an alerting solution there’s another problem: it isn’t possible to query a Power BI dataset from Power Automate… until now.

How? The new export endpoints in the Power BI REST API allow you to export a paginated report to a number of different formats, one of which is XML. This means you can write any DAX query you want, use it in a table in a paginated report, export the paginated report to XML in Power Automate and bingo – you have the output of the query in a format that Power Automate can read and do something useful with.

Let’s see an example. Take the following table of data in a Power BI dataset:

Source data

It shows sales for different countries, and let’s say that if the sales value exceeds the threshold given threshold then you want to send an email to the address given in the last column.

The following DAX query filters this table to return all the rows where sales is greater than the threshold:


Query output

It’s quite easy to create a basic paginated report in Power BI Report Builder with just a single tablix to display the output of this query:

Paginated report output

Now, let’s take the Power Automate flow that I described in my last post and alter it slightly.

First of all, instead of exporting to CSV as I did last time, you need to change the action that calls the Export To File endpoint to export the report to XML:

XML export option

More substantial changes are needed at the end of the flow, where the exported report is returned. Here’s what this part of the flow looks like at a high level after the changes:

Overview End

The first action shown here, Download the exported report file, gets the XML returned from Power BI. The SSRS documentation has a lot of detail about how a report gets rendered to XML here, but by keeping the report very basic it’s easy to understand the format of the XML. Here’s what gets returned in this case:

<?xml version="1.0" encoding="UTF-8"?>
<Details Country="UK" Sales="5" 
Threshold="2" Email="" /> 
<Details Country="New Zealand" Sales="6" 
Threshold="1" Email="" /> 
<Details Country="Netherlands" Sales="5" 
Threshold="2" Email="" /> 

The approach I’ve used to consume this XML in Power Automate is basically the one described in this post I found. The Compose action uses an XPath query to return an array containing one item for each row in the query results; writing the XPath query was quite painful but I finally got it working. Here’s the expression from the action:



'/*[local-name() = ''Report'']/*[local-name() = ''Tablix1'']/*[local-name() = ''Details_Collection'']/*[local-name() = ''Details'']')
Next, an Apply to each is used to iterate over each item in this array:
Apply to each


Finally, inside this loop, another Compose action retrieves the email address from the current iteration and this is then used to send an email:

Get email

Here’s the expression used in the Get Email Address action:



'string(/*[local-name() = ''Details'']/@Email)')
And that’s it. It’s a bit of a convoluted workaround, I admit, but it does the job; please also bear in mind the limitations of the export API listed here.
I’m sure there a lots of other things apart from alerting with the ability to consume the output of a DAX query in Power Automate, so if you have any good ideas please let me know in the comments!

Calling The Power BI Export API From Power Automate, Part 2: Creating A Flow That Exports A Paginated Report To A CSV File

In my last post I showed how to create a Power Automate custom connector for the new endpoints in the Power BI REST API for exporting a report to a file. In this post I’ll show you how to use this custom connector in a flow in Power Automate.

The three endpoints in the Power BI API that need to be called to export a report (and which have been included in the custom connector from my last post) are as follows:

  • Export To File In Group must be called first, to start the process of exporting either a Power BI report or a paginated report to a file. Exporting may take some time, though, so the exported file doesn’t get returned at this point. In the headers of the response there is a retry-after value in seconds telling you how long you must wait initially, before…
  • …calling Get Export To File Status In Group to find out whether the export is ready or not. If it isn’t ready you again need to wait the number of seconds specified in the retry-after header before checking the status again. If it is ready, you can then…
  • …call Get File Of Export To File In Group to download the export file.

This is all described in the C# example code here, so the challenge is to translate this into a flow. In my case, I want to export the following paginated report to a CSV file:

Paginated report

Here’s what my flow looks like at the top level:

High level view

To make things easy to understand I’m using a manual trigger to start the flow and hard-coding the report I want to export. The second Action calls Export To File In Group:

Start export

You will need to go into the Settings for this Action and set the Asynchronous Pattern option to Off:


As I said, in the response from this first API call the retry-after header tells you how many seconds you should wait before checking the export status for the first time. This value is used in the next Action, which adds the delay:

First delay

Once this delay has passed there’s a Do loop that will call the Get Export To File Status In Group endpoint up to ten times to check if the export is ready:

Do loop

This call uses the export id found in the response from the call to Export To File In Group:

Check status

If this call returns that the status is Running or NotStarted, then it will again wait for the number of seconds specified in the retry-after header; if not then there is no delay:

Second delay

The expression on the Do loop breaks the loop if the export status is Succeeded or Failed:

equals(body(‘Call_Export_Status_endpoint_to_check_if_export_is_ready’)?[‘status’], ‘Succeeded’),
equals(body(‘Call_Export_Status_endpoint_to_check_if_export_is_ready’)?[‘status’], ‘Failed’)

After the loop, there is a check to see if the export status is Succeeded:

Last condition

If the export status is Succeeded, then the file is downloaded and saved to OneDrive for Business:

Save file

Otherwise, the flow sends a failure notification email:

Failure email

And here’s the exported CSV file viewed in Excel Online:


And that’s it. I know this flow isn’t as robust as it could be, but I hope it gives you an idea of how to use the custom connector to export Power BI reports in Power Automate; as I said last time, I have very little Power Automate experience so please forgive any newbie mistakes! If you can see a way to improve on what I’ve done here, please leave a comment.

Automated exports from a Power BI paginated report to a CSV file is already very useful but there are other fun things we can do with this – and in my next post I’ll show you another, less obvious example.

Calling The Power BI Export API From Power Automate, Part 1: Creating A Custom Connector

Recently, a new set of endpoints in the Power BI REST API for exporting Power BI reports and paginated reports to files went into public preview (see the main announcement here and the paginated reports announcement here). Since there are all kinds of cool things you can do with this I thought I would write a few posts on how to use these endpoints in Power Automate. In this post I’ll talk about setting up a custom connector in Power Automate; in the next post I’ll talk about how to use this custom connector in Power Automate; and after that I’ll show you some less obvious uses for all this.

Before we go any further, and before you get too excited, you should read the limitations of the public preview listed here, especially those around the number of report pages that can be exported per hour and the number of pages in a report that can be exported. Also, this functionality is only available with Power BI Premium or Power BI Embedded.

Creating a Power Automate custom connector for the Power BI REST API is something that several people have blogged about already in detail. Konstantinos Ioannou has a very detailed walkthrough here; Jese Navaranjan has a video walkthrough here; and I blogged about how you could use the Swagger definition of the Power BI REST API to create a custom connector here. I’m not going to go over all these steps again but there are a few specifics that need pointing out if you want to build your custom connector manually. If you’re lazy, I exported my custom connector to a Swagger file and you can download it here; you should be able to import it and create your own custom connector very easily.  I don’t pretend to be a Power Automate expert so please excuse any newbie mistakes!

The three endpoints that you’ll need to use to export to a file (unless your report is in your My Workspace, in which case there are three other equivalent endpoints) are Export To File In Group, Get Export To File Status In Group and Get File Of Export To File In Group. There’s good documentation for regular Power BI reports here and paginated reports here, but in summary Export To File In Group starts the export process but because this might take a long time, doesn’t return the exported file; Get Export To File Status In Group allows you to check the status of the export; and Get File Of Export To File In Group returns the file once the export is ready.

Creating a custom connector in Power Automate is a four-step process and steps 1, 2 and 4 (“General”, “Security” and “Test”) are well covered in the guides above. Step 3 (“Definition”) is where you need to create three Actions for the three endpoints above.

After filling in the General and Security pages, go to the Definition page and click the New action button to create an Action for the Export To File In Group endpoint. You first need to fill in the information in the General section:


Then in the Request section click the Import from sample button, select POST and paste{groupId}/reports/{reportId}/ExportTo

…into the URL box and the sample payload here into the Body box:


Setting up the other two endpoints is similar except that you need to select GET instead of POST and you don’t need to paste anything into the Body box. Here are the two templatised URLs to use for them:

Get Export To File Status In Group:{groupId}/reports/{reportId}/exports/{exportId}

Get File Of Export To File In Group:{groupId}/reports/{reportId}/exports/{exportId}/file

There’s more to do on this page, but at this point you should go to the Test page and test these three new Actions. For Export To File In Group you need to at least enter a groupId, a reportId and a format:


The groupId and reportId can be found by navigating to your report in the Power BI portal; you can extract the groupId and reportId from the URL like so:


The valid values for format (and the other parameter values) are in the docs.

Assuming that it all works, you will be able to scroll down and see the response. Copy all the JSON in the Body section and save it somewhere temporarily.


Do the same thing for the other two Actions you have created; these two actions take a third parameter called exportId, which is the id value in the response that I’ve highlighted in the screenshot immediately above.

Once you have the Body text from Export To File In Group and Export To File Status In Group (the response from Get File Of Export To File In Group should be your report export), go back to the Definitions page and for these two Actions scroll down to the Response section and click on Add default response.


There are two things to do here for both Actions. First, paste the responses you got on the Test pane into Body; next, paste:

retry-after 30

…into the Headers section and then click Import. This will make the output of the Actions much easier to consume in Power Automate later on.

Your custom connector is now ready, and in part 2 of this series I’ll show you how to use it in Power Automate.

[Thanks to Jaime Tarquino and Chris Finlan for their help getting this working]


%d bloggers like this: