New Options For The Table.Buffer Function In Power Query

Chris Webb M, Power BI, Power Query March 20, 2022March 22, 2022 2 Minutes

The March 2022 release of Power BI Desktop includes some new options for the Table.Buffer M function that can be set in a new second parameter. The options are of type BufferMode.Type; if you look at the built-in documentation for this type you’ll see the following:

The two allowed values are:

BufferMode.Eager: The entire value is immediately buffered in memory before continuing
BufferMode.Delayed: The type of the value is computed immediately but its contents are not buffered until data is needed, at which point the entire value is immediately buffered

Anyone with previous experience of Table.Buffer will see that BufferMode.Eager is the existing behaviour, but what is BufferMode.Delayed for?

It turns out that it’s there to make development faster. Consider the following M query that loads data from a CSV file with seven columns and a million rows in, and then uses Table.Buffer to buffer that table into memory:

let
  Source = Csv.Document(
    File.Contents(
      "C:\NumbersMoreColumns.csv"
    ),
    [
      Delimiter  = ",",
      Columns    = 7,
      Encoding   = 65001,
      QuoteStyle = QuoteStyle.None
    ]
  ),
  #"Promoted Headers"
    = Table.PromoteHeaders(
    Source,
    [PromoteAllScalars = true]
  ),
  #"Changed Type"
    = Table.TransformColumnTypes(
    #"Promoted Headers",
    {
      {"A", Int64.Type},
      {"B", Int64.Type},
      {"C", Int64.Type},
      {"D", Int64.Type},
      {"E", Int64.Type},
      {"F", Int64.Type},
      {"G", Int64.Type}
    }
  ),
  BufferTable = Table.Buffer(
    #"Changed Type"
  )
in
  BufferTable

When I refresh this query in Power BI Desktop on my PC I see the “Evaluating…” message for 20 seconds before the data starts to load:

If, however, I add the second parameter [BufferMode = BufferMode.Delayed] to Table.Buffer like so:

let
  Source = Csv.Document(
    File.Contents(
      "C:\NumbersMoreColumns.csv"
    ), 
    [
      Delimiter  = ",", 
      Columns    = 7, 
      Encoding   = 65001, 
      QuoteStyle = QuoteStyle.None
    ]
  ), 
  #"Promoted Headers"
    = Table.PromoteHeaders(
    Source, 
    [PromoteAllScalars = true]
  ), 
  #"Changed Type"
    = Table.TransformColumnTypes(
    #"Promoted Headers", 
    {
      {"A", Int64.Type}, 
      {"B", Int64.Type}, 
      {"C", Int64.Type}, 
      {"D", Int64.Type}, 
      {"E", Int64.Type}, 
      {"F", Int64.Type}, 
      {"G", Int64.Type}
    }
  ), 
  BufferTable = Table.Buffer(
    #"Changed Type", 
    [BufferMode = BufferMode.Delayed]
  )
in
  BufferTable

Then, when I run my refresh, the “Evaluating…” message only appears very briefly before the data starts to load:

It’s important to stress that after the “Evaluating…” phase the data load takes exactly the same amount of time – it’s only the “Evaluating…” phase that is faster. This can save you a lot of time as a developer, nonetheless. I have been told when these options are available in dataflows they will make validation (which occurs when you close Power Query Online after editing a dataflow) much faster too – in fact this developed to partially solve the dataflow validation problem.

[Thanks to Curt Hagenlocher for this information]

Update: something I should make clear is that this functionality is only useful for people who are already using Table.Buffer in their queries. If you’re not using Table.Buffer already then these changes won’t be of any benefit or interest.

Published by Chris Webb

My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel. View all posts by Chris Webb

Published March 20, 2022March 22, 2022

21 thoughts on “New Options For The Table.Buffer Function In Power Query”

David Beavon says:

March 21, 2022 at 12:13 am

Typo?:
The type of the value is computed immediately but its contents are **NOT** buffered until data is needed…

Chris, would you please do a blog sometime about the Azure-managed VNET gateway for Power BI? That team has had their stuff in preview for a whole year and it is unreliable and opaque and impossible to troubleshoot when it is misbehaving (which is often). This public preview was announced March of 2021. Based on what limited visibility I have, and based on the pace of change, I’d guess it will be another year or more before GA.

I suspect that all the customers of this feature are having as much trouble as we are. We have opened a permanent support case, to alert product engineers each time there is a failure, and get their feedback every time. This is necessary because nobody has access to view the gateway logs except the product team (not even the support team). The rate of failures is very high (approx. 1 to 5%, depending on the week and the phase of the moon).

…I wish that Microsoft public previews would be rated somehow, based on the expected GA date and/or based on the level of engineering investment & talent that is being allocated to a given feature. This is a public preview that I would have certainly avoided in retrospect. But after struggling for months and months, I now have a lot of my own skin in the game.

Loading...

Reply
1. Chris Webb says:
  
  March 21, 2022 at 8:01 am
  
  Thanks, I fixed the typo. RE VNet gateways to be honest I don’t think there’s much I can say on the subject apart from that it’s in preview and it needs more work to avoid the kind of problems you describe. Sorry…
  
  Loading...
  
  Reply
2. James Ward says:
  
  March 21, 2022 at 5:27 pm
  
  David, the roadmap shows VNet General Availability in September – https://docs.microsoft.com/en-us/power-platform-release-plan/2022wave1/data-integration/virtual-network-connectivity-power-bi-power-platform-dataflows
  
  Loading...
  
  Reply
Olly Marshall-Moore says:

March 21, 2022 at 8:54 am

Do we know when this function enhancement will become available in dataflows?

(Trying this method in a dataflow this morning results in “2 arguments were passed to a function which expects 1”)

Loading...

Reply
1. Chris Webb says:
  
  March 21, 2022 at 8:56 am
  
  No, but it should be soon I hope
  
  Loading...
  
  Reply
Pingback: Power Query BufferMode Types – Curated SQL
cppmonkey says:

March 21, 2022 at 2:07 pm

1] Are the `*.Buffer()` family of functions able to release memory, if references go out of scope?

For example: If you want to force sorting before merging two tables, where the final merge does not include all records in A — will A still be in memory?

2] If you want to force evaluation, but don’t actually need the full table in memory at once, is there an alternative?

For example: When generating multiple random values, you can end up in a situation where lazy eval causes `Table.AddColumn` to generate a random number once, applied to every row.

I used `Table.AddIndexColumn` to force eval (causing all random numbers) instead of using `*.Buffer()` because I was thinking that

– That it doesn’t have to load the entire table into memory before processing — it allows streaming
– That it doesn’t waste memory by buffering everything

Loading...

Reply
1. Chris Webb says:
  
  March 21, 2022 at 3:34 pm
  
  1) I don’t know and I don’t expect we’d make that level of detail public
  2) I don’t think so…
  
  Loading...
  
  Reply
2. Curt Hagenlocher says:
  
  March 22, 2022 at 3:21 pm
  
  1) It depends. Tables frequently hold references to the tables which were used to produce them, so even if the buffered table itself was no longer directly available it might still be pinned in memory by some other value.
  
  2) In short, there’s no function which is currently guaranteed to do this. Table.AddIndexColumn does achieve this today, and I’ve seen enough people use it that way that we probably won’t ever make it a foldable function for risk of breaking the behavior of existing queries. FWIW, random numbers should be fixed now so that they don’t do weird things with Table.AddColumn.
  
  Loading...
  
  Reply
brent smith says:

March 21, 2022 at 3:03 pm

Hi Chris, this actually makes me think – I have wrapped a Table.Buffer() around my sorting steps to force the step before subsequent steps … but maybe you know of a better approach?

Loading...

Reply
Mark says:

March 21, 2022 at 10:59 pm

I’ve been using Table.Buffer for a long time and I’m happy to hear it will make it’s way to dataflows to make validation quicker.

What I don’t understand is what makes the new type faster? Both types are buffered into memory but at different times – one is immediately and the other when the data is needed, which would be at refresh for the latter, correct? I don’t understand how they affect the evaluation time differently when loading. If anything, wouldn’t the Eager type evaluate quicker?

Loading...

Reply
1. Chris Webb says:
  
  March 22, 2022 at 1:07 pm
  
  What makes it faster is that it allows the buffering to only take place when the dataset or dataflow is actually refreshing, and not when various metadata checks (ie the stuff that happens when the “Evaluating” message appears in Desktop, or the “Validating” phase when you close a dataflow) happen before a refresh takes place when you’re developing.
  
  Loading...
  
  Reply
sam says:

March 23, 2022 at 5:52 am

Instead of all this why not just have a Manual mode of Calculation similar to Power Pivot or Excel which can be switched on during development which forces PQ to work with just the first 1000 rows in the preview.

Loading...

Reply
Pingback: Chris Webb's BI Blog: I’m Posting On The Power Query Blog Too! Chris Webb's BI Blog
CppMonkey says:

September 13, 2022 at 2:50 pm

Yes,There’s a new function named:Table.StopFolding
Chris posted an article about it on his other blog:

https://powerquery.microsoft.com/en-us/blog/stop-query-folding-with-table-stopfolding/

Loading...

Reply
Pingback: How to Improve Power BI with SharePoint Online Performance - MS Wiki Magazine
Kenneth Barber says:

June 27, 2023 at 5:06 am

Assuming that you are already using Table.Buffer, is there ever any time when you would want to use BufferMode.Eager instead of BufferMode.Delayed? If not, then why not just change Table.Buffer to always use BufferMode.Delayed instead of allowing the choice between the 2 buffer modes?

Loading...

Reply
1. Kenneth Barber says:
  
  February 20, 2024 at 5:56 pm
  
  I found an answer to my own question! I tried to replicate the technique described in https://blog.crossjoin.co.uk/2014/11/17/timing-power-query-queries/. However, Table.RowCount did not force the entire rest of the query to evaluate before the end time was evaluated. I then tried Table.Buffer with BufferMode.Delayed and it also did not work. However, Table.Buffer with BufferMode.Eager did work!
  
  I still wonder in what other cases BufferMode.Eager is preferred over BufferMode.Delayed.
  
  Loading...
  
  Reply
Pär Adeen says:

January 30, 2024 at 6:48 pm

Looks like this is now implemented in Power Query online as well

Loading...

Reply
Sergio Castro says:

June 3, 2024 at 4:40 pm

Hi,
Thanks for this post. It’s been very useful using this new parameter for development. However, when should we use this parameter over the Eager?. I mean, Chris mentioned in a comment that this parameter alows skiping “various metadata checks”. In what cases we would want to skip metadata refresh? I mean, if there’s a metadata change, shouldn’t we always allow to refresh it? What would happen if we decide to use always the new parameter because it allows improving evalutating times?
Thanks!

Loading...

Reply
1. Chris Webb says:
  
  June 4, 2024 at 12:42 pm
  
  You should use it in situations where you’re sure the schema of the table you’re buffering is never going to change. I suspect you’ll get errors if you use it and the schema does change.
  
  Loading...
  
  Reply