Why Is Power BI Running My SQL Query Twice?

When you import data from a relational database like SQL Server in Power BI you have the option of entering your own SQL query to use as a starting point:

NativeSQL

Here’s the M code for a query that does this:

If you’re confident writing SQL this might seem like a good option, but as I said in this blog post it has the side-effect of disabling query folding inside the Power Query query, so if you add any other transformations they will always be performed inside the Power Query engine – which may be less efficient than performing them in the data source.


let
Source=Sql.Database(
"localhost",
"AdventureWorksDW2017",
[Query
="SELECT [DateKey]#(lf) ,[FullDateAlternateKey]#(lf) ,
[DayNumberOfWeek]#(lf) ,[EnglishDayNameOfWeek]#(lf)
FROM [AdventureWorksDW2017].[dbo].[DimDate]"]
)
in
Source

There’s also another drawback: when you refresh your dataset in Power BI Desktop (although not in the Power BI Service) you’ll see that your SQL query is run twice. Here’s the evidence from SQL Server Profiler showing what happens when the query above is refreshed in Power BI Desktop:

If your query is slow, or if each query execution costs you money, then this is something you want to avoid.

Why is this happening? It turns out this is just another example of what I blogged about here: Power BI wants to know the schema of the table before the query actually runs, so it asks Power Query to return the top 0 rows. Unfortunately, in this case query folding can’t take place and the top 0 filter can’t be pushed back to the database, so the entire query gets run once to get the schema and once to get the data.

The solution is the same as the blog post I just mentioned too: use the Table.View M function to hard-code the schema returned by the query and implement query folding manually. Here’s the adapted version of the new query:

let
Source = Sql.Database(
"localhost",
"AdventureWorksDW2017",
[Query
= "SELECT [DateKey]#(lf) ,[FullDateAlternateKey]#(lf) ,
[DayNumberOfWeek]#(lf) ,[EnglishDayNameOfWeek]#(lf)
FROM [AdventureWorksDW2017].[dbo].[DimDate]"]
),
OverrideZeroRowFilter = Table.View(
null,
[GetType = () => type table[
DateKey = Int32.Type,
FullDateAlternateKey = DateTime.Type,
DayNumberOfWeek = Byte.Type,
EnglishDayNameOfWeek = Text.Type
], GetRows = () => Source, OnTake
= (count as number) =>
if count = 0
then #table(GetType(), {})
else Table.FirstN(Source, count)]
)
in
OverrideZeroRowFilter

Generally speaking, I think there’s a lot to be said for creating views (if possible) instead of embedding your own SQL into a Power BI dataset – it makes maintenance and tuning much easier, and of course if you can connect straight to the view without writing any SQL in Power BI, then query folding will work and Power BI Desktop will only query the view once when you refresh.

UPDATE: there’s now a much better way of solving this problem, which involves using Value.NativeQuery function and the EnableFolding=true option https://blog.crossjoin.co.uk/2021/02/21/query-folding-on-sql-queries-in-power-query-using-value-nativequery-and-enablefoldingtrue/

32 thoughts on “Why Is Power BI Running My SQL Query Twice?

  1. Thanks Chris, but I have question though. Does the same behaviour occur when using you own MDX query in the AnalysisServices.Database function?

  2. So if this is just a problem on the desktop why should I care? It would only impact the developer, not the report consumers that access it in the service, right?

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Yes, but it can make your life as a developer harder and as I said, some data sources charge you per query.

  3. Thank you.
    I was using a SQL sentence to reduce the data to import by filtering by model parameter (year). I thought it was more efficient than filtering in PowerQuery. Now I see it maybe wasn’t the best approach…

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      No, as I said in the post, it doesn’t happen when you refresh in the Power BI Service.

  4. Are you able to go over in more details the steps with that function to extract the fields in a query so that it can be plugged in? I tried to follow your directions but was not quite getting it. Our company uses almost exclusively SQL queries so that would be a great help Thank you!

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      You need to take the M codein this blog post https://blog.crossjoin.co.uk/2018/10/03/function-m-table-type/ and then paste it into a new blank query. This will give you a function; you can then pass the name of the query whose data types you want into this function.

      1. So I paste that code in and then afterwords right click on the query and say “convert to function”. When I do that there is a dialogue box that pops up that says “The query does not reference any parameters. Are you sure you want to create the function without any parameters. If I say OK there, I don’t really see how to get the function to work to pass the name in. That’s where I’m getting lost. Do you happen to have a sample small PBIX file by chance?

      2. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        You don’t need to right click and “convert to function” – the code is already a function, so when you paste it into a blank query you’ll have a function ready to use. When you click on the new query that returns the function you should see a dropdown box and be able to select any other query to pass in, and when you do this and invoke the function it will create yet another query that returns the results.

      3. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        I’ll try to create a pbix file with an example in and post it here within the next day or so.

      4. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        Sorry for the delay, you can get a sample pbix here: https://1drv.ms/u/s!AppIOcLysIR7gZgCJCs7vmAsB1pNRA?e=XOURY7 There are three queries here: the function to get types, a sample query that gets a table from a public data source, and a query that calls the function on this sample table and returns the data types.

  5. OK…I appreciate you getting a sample file. That would be amazing. I don’t see the dropdown box you’re talking about after creating the query but I’ll keep digging. I just don’t get what I’m missing. Thanks again.

  6. Hello Chris,

    I tried to implement what you present. But I am having problems with null values. I get the error “The type of the value does not match the type of the column” in many rows. I used Text.Type and Decimal.Type. How do you manage null values ?

    Thank you.

  7. Hi Chris,

    I have just tried your examples and regarding the first example I have discovered the following:

    Even though Power BI Desktop starts the statement twice, it seems that the first statement is more a “tracer bullet” to make sure that the schema is correct, as the first statement is aborted. In SQL Profiler I included the EventClass SQL:BatchCompleted and the Column Error and it reveals that the first statement returns the error code “2 – Abort”.

    Then afterwards the second statement is executed and it completes with an “0 – OK” and the actual rows are returned.

    So it looks like Power BI Desktop somehow breaks the connection for the first statement. This of course have a price, but not as high as if the result set were returned twice.

    By the way I am using the June 2020 Power BI Desktop – 2.82.5858.1161

    – Your blog posts is very deep and interesting 🙂

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Thanks! I didn’t catch that. I do know that for other data sources this is a much bigger problem, I guess because Power BI isn’t able to abort queries in the same way.

  8. Here’s a direct proof of Chris’ assertion.

    Create a table on SQL Server:

    CREATE TABLE [dbo].[test](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [timestamp] [datetime2](7) NULL,
    [machine] [nvarchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    CONSTRAINT [PK_test] PRIMARY KEY CLUSTERED
    (
    [id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    Run this Power Query code:

    Sql.Database(“server”, “database”, [Query=”insert test (machine) select ‘this one’#(lf)select * from test”, CreateNavigationProperties=false])

    This will insert two rows each time.

  9. Hi, Chris. I tried to do this in a Oracle connection but it didn’t work, actually sometime it works but not always. Could you help me understand why? Another thing is that my sql take 5 min just to connect, what can it be?

  10. Dear Chris,

    I have some Power Queries embedded in Excel sheets which I use to do updates and inserts on a Microsoft SQL db. Today I faced an issue with the insert, where the SQL server was complaining that I was attempting to write two records with the same primary key, and I immediately realized that it was Power Query that was running twice my native query against the db.

    A few months ago I did not encounter this issue, and the Excel sheet (with the embedded query) was exactly the same. However, I was using Excel 2016, whereas recently my company switched to Excel 365. Are you aware of any change in the folding algorithm between these two versions of Excel?

    Thanks in advance and best regards,

    Enrico Menotti

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Using Power Query to do inserts is not supported (I know I blogged about it once, but I think I was clear that you shouldn’t do it in the real world) precisely because of this issue.

  11. Right, I know it’s not supported, and now that you tell me I remember that the first time I googled about Power Query SQL updates I stepped upon that old post of yours. It was clear there’s this double-run issue, and also there’s a similar issue with previews, which run any query just when you move your mouse pointer over the query name in the “Query and connections” tab at the right of the Excel sheet. Not to talk about any edit in the editor and advanced editor, which run the query immediately.

    However, doing an update two times with the same value cannot do any harm to the db, as far as I know (please correct me if I’m wrong). What worries me is instead the chance of updating a field while the record is locked by another user from the ERP GUI; this is something I’ve never investigated on.

    Based on these considerations, I have successfully used Power Query for SQL updates, where in any case the use is just for non-ordinary mass modifications which cannot be done from our ERP GUI, and I’m the only user who does this. Whenever I do such things, I always first test thoroughly on a single record, and only after that go for mass modifications, and at the end always check thoroughly the results. Since my data always come from Excel spreadsheets, this technique is pretty useful and fast, and has the advantage of being quite minimal (it’s based on Excel’s built-in features only). Maybe one may try even with the old Microsoft Query which appears to be still at the basis of Excel external connections, but the M language provides nice functions for preparing the data to be sent to the SQL server: I use M to take care of localizations (the most tough thing to do, I have to say) and prepare a proper query text, so that when I run it the server is asked to create a temporary table (in an @TEMP variable) with the data, and this then serves as the source for the update statement.

    In any case, I’d think very well before even considering a chance to go to live production environments, for daily use by other people I mean, with such a technique. Too fragile, actually.

    For what concerns the insert statements, after many months of doing updates this way without encountering any minimal issue, I decided to try with inserts as well. Here what takes care of avoiding duplicates is the primary key, which cannot be duplicated. So the worst that may happen is an error message. And again, first I test with a single record, and at the end check all the results.

    Usually I don’t like “quick and dirty” things, but this is a case where being quick and dirty saves a lot of man hours, provided the end results are clean.

    Anyway, my question remains: why does Excel 365 raise an error where Excel 2016 doesn’t? And, just to close this comment, let me say that your suggestion to use table.view works great in removing the Excel 365 error. I go this way: by M, I prepare a string “Data” for asking SQL to prepare the @TEMP variable, and then do

    Dest = Sql.Database(“…”, “…”, [Query = ”

    declare @TEMP table
    (
    … [table structure]
    );
    insert into @TEMP values ” & Data & “;

    insert into … [name of the db table] select * from @TEMP;

    select * from @TEMP [this is just to see what has been actually sent to SQL]

    “])

    in

    Dest

    The way I have been able to remove the error has been by substituting the final “Dest” by:

    Table.View(null, [GetType = () => Value.Type(GetRows()), GetRows = () => Dest])

  12. This is astonishing. I modified my query in VBA for a power pivot table that returns 800k rows. The query takes about a minute to refresh. While watching the activity monitor on my SQL Server I watched Active Expensive Queries pop twice, which lead me to this blog. There are a bunch of columns so Table.View doesn’t seem viable at this point (perhaps it will be). Since my SQL executes a stored procedure, I fed the current user name in as a parameter then I check a table when the SP is executed to see if this is the first or second time the SP was executed by the user. If first, it returns only the schema from a table variable. If the second time, it populates the full table variable and returns the results. Kind of a nasty workaround and I don’t know if it will hold up. I hope there’s some sort of solution to the Top 0 issue.

    1. My workaround didn’t end up working because SOMETIMES the query is called twice, but sometimes it isn’t. It appears that when the query stays EXACTLY the same those two calls aren’t made. However, if you change a PARAMETER of the query it is executed twice. I noticed that in Excel Power Pivot that your code does not appear to fix the issue.

      I wrote an INSERT into my stored procedure to insert into a log table for every execution. Two entries were made when the parameters of my query were significantly changed.

      Following code works “sometimes”:

      = let
      JCCo = if #”ctJCCo”=null then “2” else #”ctJCCo”,
      Mth = if #”ctMth”=null then “6/1/2022″ else #”ctMth”,
      Job = if #”ctJob”=null then “null” else #”ctJob”,

      Source = Sql.Database(“MYONPREMISESQL”, “MYDATABASE”, [Query=”
      EXEC dbo.MyPowerQueryReport_PQ @JCCo=” & JCCo & “, @Mth=N'” & Mth & “‘, @Contract='” & Job & “‘
      “]),
      OverrideZeroRowFilter = Table.View(
      null,
      [GetType = () => type table[
      JCCo = Int64.Type,
      Job = Text.Type,
      JobDesc = Text.Type,
      Item = Text.Type,
      JobStatus = Int64.Type
      ], GetRows = () => Source, OnTake
      = (count as number) =>
      if count = 0
      then #table(GetType(),{})
      else Table.FirstN(Source,count)]
      )
      in
      OverrideZeroRowFilter

      This is awful. Power Pivot seems to be an absolute mess. I do all the work finding, gathering, prepping the data. I can write DAX queries to manipulate it. But MS can’t do the simple job of calling a query ONLY ONCE or at least providing some sort of value that I can use as a parameter to notify my query that they only want the schema on “this call”.

  13. Hi Chris, thank you for this, it is amazing, I spend quite a long time googling for this. I am loading my data from Oracle database, and the described fix works well, but then in power query I need to append an excel document (there are manual adjustments and no possible workaround at this moment) and after the append the query is loading twice again, taking horrible long time. Do you have any experience with appending SQL and excel and avoiding the double loading? Many thanks, Olga

Leave a Reply to mmichaels1970Cancel reply