Troubleshooting Power BI Timeouts, Part 3: The External Command Timeout

In the first post in this series I showed how any Power BI dataset refresh started via the Power BI portal or API is limited to 2 hours in Shared capacity and 5 hours in Premium capacity, and how you could work around that by running a refresh via Premium’s XMLA endpoint feature. In the second post in this series I showed how some M functions allow you to set timeouts. However, even if you initiate a refresh via the XMLA endpoint you may still get a timeout error and in this post I’ll discuss another reason why: the External Command Timeout.

This property is a hangover from Analysis Services (you can see it documented here). It represents the amount of time the Analysis Services engine inside Power BI will wait to get data from a data source. How it behaves exactly depends on the data source: it may limit the amount of time it takes to get the first row of data from the source or it may limit the amount of time it takes to get all the rows of data. In Power BI Premium it is set to five hours, which means that no single partition can take more than about five hours to refresh. In the first post in this series I worked around this by creating a dataset with multiple partitions, each of which took about an hour to refresh, but when trying to refresh a dataset with a single partition that takes more than five hours I got the following error when trying to refresh from SQL Server Management Studio through the XMLA Endpoint:

The error message here is:

Timeout expired. The timeout period elapsed prior to completion of the operation.. The exception was raised by the IDbCommand interface.

In this case I saw the same error in the Error event in Profiler:

…and in other cases, when testing a different source, I got a different error in Profiler in the Progress Report Error event:

The message here is:

Error processing partition ‘<pii>SlowCommand-2114bb81-69d3-4fe4-9d54-6b2661b0c497</pii>’ of table ‘<pii>SlowCommand</pii>’ [Internal Error Code: ‘0xc112001a’].

There’s no way to avoid the External Command Timeout. Instead, what you need to do is either change your partitioning strategy so each partition refreshes in under five hours or tune your data source, M code or gateway (if you’re using one) so that data is returned to Power BI faster.

In Shared capacity I believe the External Command Timeout is set to two hours (again, to match the overall refresh timeout) but it’s much less important there because you can’t create partitions manually (the only way a dataset in Shared can be partitioned is by setting up incremental refresh) and there’s no XMLA Endpoint so there’s no way to work around the two hour overall refresh limit anyway.

[Thanks, as always, to Akshai Mirchandani for a lot of the information in this post]

Troubleshooting Power BI Timeouts, Part 2: Timeouts Specified In Power Query Functions

In the first post in this series I showed how the Power BI Service applies a limit on the total amount of time it takes to refresh a dataset in the Power BI Service, except when you initiate your refresh via an XMLA Endpoint. In this post I’ll look at the various timeouts that can be configured in Power Query functions that are used to access data.

Every time a Power BI Import mode dataset connects to a data source it goes through a Power Query query, and inside the code of that Power Query query will be an M function that connects to a specific type of data source. Most – but not all – of these M functions have the option to set timeouts. The M functions that connect to relational databases (for example Sql.Database which is used to connect to SQL Server-related sources, or Odbc.DataSource which is used to connect to ODBC sources) typically allow you to configure two types of timeout:

  • A connection timeout, which specifies how long to wait when trying to open a connection to the data source
  • A command timeout, which specifies how long the query to get data from the source is allowed to run

Some other functions have other timeouts more appropriate to the data source they access: for example Web.Contents and OData.Feed have a Timeout property for the HTTP requests they make behind the scenes. Other functions (for example Excel.Workbook) have no timeout-related properties that you can set at all. You should check the documentation , either online or using the #shared intrinsic variable in the Power Query Editor, to see what options are available and what the default timeouts are.

Here’s a simple example of how to set a timeout when connecting to SQL Server. First of all, I created a scalar function called ForceDelay in TSQL that returns a value of 1 after a specified number of seconds, so the query:

select [dbo].[ForceDelay](10) as MyColumn

…takes 10 seconds to run.

When you connect to SQL Server in Power BI/Power Query you can paste a SQL query in the connection dialog under Advanced Options; when you do so you’ll also see the option to set the command timeout in minutes:

Here’s the M code generated when you use the SQL query above, set the Command timeout to 1 minute and click OK:

let
  Source = Sql.Database(
    "localhost",
    "AdventureWorksDW2017",
    [
      Query
        = "select [dbo].[ForceDelay](10) as MyColumn",
      CommandTimeout = #duration(
        0,
        0,
        1,
        0
      )
    ]
  )
in
  Source

Notice that the CommandTimeout option has been set on the Sql.Database function, and that the value passed to it is a duration of one minute defined using #duration. Since the SQL query takes 10 seconds to run and the timeout is 1 minute, it runs successfully.

However, if you set the CommandTimeout option to 5 seconds like so:

let
  Source = Sql.Database(
    "localhost", 
    "AdventureWorksDW2017", 
    [
      Query
        = "select [dbo].[ForceDelay](10) as MyColumn", 
      CommandTimeout = #duration(
        0, 
        0, 
        0, 
        5
      )
    ]
  )
in
  Source

…then the query will take longer that the timeout and you’ll see the following error in the Power Query Editor:

The error message is:

DataSource.Error: Microsoft SQL: Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

If you hit the timeout when refreshing in the main Power BI window you’ll see the same message in the refresh dialog box:

So, as you can see, if you are working with large data volumes or slow queries you will need to be aware of the default timeouts set in the M functions you’re using and alter them if need be.

Troubleshooting Power BI Timeouts, Part 1: The Two/Five Hour Import Refresh Limit

If you’re working with a large Power BI dataset and/or a slow data source in Import mode it can be very frustrating to run into timeout errors after you have already waited a long time for a refresh to finish. There are a number of different types of timeout that you might run into, and in this series I’ll look at a few of them and discuss some of the ways you can work around them.

In this post I’ll look at one of the most commonly-encountered timeouts: the limit on the maximum length of time an Import mode dataset refresh can take. As documented here these limits are:

  • Two hours for an import-mode dataset in Shared capacity
  • Five hours for an import-mode dataset in a Premium or PPU capacity

If you are using Premium you can connect to your dataset using SQL Server Profiler and run a trace when the refresh starts and you’ll see the timeout set as a property of the TMSL command that runs the refresh (18000 seconds = 5 hours):

Here’s an example of the message you will see in the Refresh History dialog if your dataset refresh takes more than five hours in PPU:

Once again you get more details in a Profiler trace. When the refresh times out you’ll see an Error event with long message that starts with the following text:

Timeout expired. The timeout period elapsed prior to completion of the operation.. The exception was raised by the IDbCommand interface. The command has been canceled..

There is a way to work around this limit if you’re using Premium or PPU: it only applies to refreshes (scheduled or manual) that you start from the Power BI portal or via the REST API. If you start your refresh by running a TMSL script via the XMLA Endpoint (for example via SQL Server Management Studio or from PowerShell) then the limit does not apply. This is because you’re executing your own TMSL script rather than having the Power BI Service execute its own refresh command – with a timeout set – like the one shown in the screenshot above.

For example, here’s the same dataset from the screenshot above but refreshed successfully from SQL Server Management Studio (notice the Type column says “Via XMLA Endpoint”) and with a duration of just over five hours:

There are a couple of blog posts out there showing how you can implement a refresh strategy using the XMLA Endpoint; this post from Marc Lelijveld and Paulien van Eijk is a great example of how to do this using Azure Automation and Azure Data Factory.

You should also consider tuning any refresh that takes a long time and it could be that after tuning it you fall well within the two/five hour limit. There are a lot of things to consider when tuning dataset refresh; I did a conference session on this topic last year (you can watch the recording here) covering issues like data modelling, query folding in Power Query and the use of dataflows but it doesn’t cover everything and I have learned a lot even since then. In my opinion one of the most important things you can do to improve refresh performance for very large Import datasets is to partition your fact tables and increase the amount of parallelism, as I describe here. One customer I worked with was able to reduce their refresh time from four hours down to one hour using this technique:

As I said, though, there are many different types of timeout that you may encounter – so even if you refresh via the XMLA Endpoint it may still time out for another reason.

%d bloggers like this: