Handling Data Source Errors In Power Query

Recently I was asked an interesting question by Ondra Plánička in the comments of a blog post: how can you handle errors caused by unavailable or missing data sources in Power Query?

Let’s imagine you are loading a csv file like this one into Excel using Power Query:

image

The M query generated by Power Query will be as follows:

let

    Source = Csv.Document(File.Contents("C:\Users\Chris\Documents\SampleData.csv")

                ,null,",",null,1252),

    #"First Row as Header" = Table.PromoteHeaders(Source),

    #"Changed Type" = Table.TransformColumnTypes(#"First Row as Header"

                ,{{"Month", type text}, {"Product", type text}, {"Sales", Int64.Type}})

in

    #"Changed Type"

 

If you load into the Excel Data Model you’ll see the following in the Power Pivot window:

So far so good. But what happens if you try to refresh the query and the csv file is not there any more? The query refreshes but you will see the following in the Power Pivot window:

The structure of the table that has been loaded has changed: instead of three columns you get just one, containing the error message. This wipes any selections in Excel PivotTables that are based on this table; they will need to be recreated when the source file is available once again. Similarly, any relationships between this table and other tables in the Excel Data Model get deleted and have to be added again manually when the source file is there again. Not good.

Here’s how to alter the query so that it handles the error more gracefully:

let

    //This is the original code generated by Power Query

    Source = 

      Csv.Document(File.Contents("C:\Users\Chris\Documents\SampleData.csv"),null,",",null,1252),

    #"First Row as Header" = Table.PromoteHeaders(Source),

    #"Changed Type" = Table.TransformColumnTypes(#"First Row as Header",

      {{"Month", type text}, {"Product", type text}, {"Sales", Int64.Type}}),

    //End of original code

    //Define the alternative table to return in case of error    

    AlternativeOutput=#table(type table [Month=text,Product=text,Sales=Int64.Type],

      {{"Error", "Error", 0}}),

    //Does the Source step return an error?

    TestForError= try Source,

    //If Source returns an error then return the alternative table output

    //else return the value of the #"Changed Type" step

    Output = if TestForError[HasError] then AlternativeOutput else #"Changed Type"

in

    Output

 

While the code from the original query remains intact, the following extra steps have been added:

  • The AlternativeOutput step returns a table (defined using #table) that has exactly the same columns as the csv file. This table has one row containing the text “Error” in the two text columns and 0 in the Sales column.
  • The TestForError step uses a try to see whether the Source step returns an error (for example because the file is missing)
  • The Output step checks to see whether TestForError found an error – if it does, it returns the table defined in the AlternativeOutput step, otherwise it returns the contents of the csv file as returned by the #”Changed Type” step.

Now when you run the query and the csv file is missing, then you see the following in the Power Pivot window:

Because this table has the same structure as the one the query returns when the csv file is present, any PivotTables connected to this table will still retain their selections and the relationships in the Excel Data Model are left intact. This means that when the csv file is back in its proper place everything works again with no extra work required.

You can download the example workbook and csv file here.


Discover more from Chris Webb's BI Blog

Subscribe to get the latest posts to your email.

36 thoughts on “Handling Data Source Errors In Power Query

  1. Fantastic, Chris. 🙂

    I ran into this issue with a client’s file too, but not related to pivot tables. We retrieved the data via Power Query into a table, and then added a column to check the existence of the customer ID in another table. (Seemed like a good idea at the time, but in retrospect should have merged a query into Power Query to check.) Challenge is that when Power Query choked and went to it’s single column view, it removed our calculated column… which was part of some critical logic. I actually ended up dealing with it using VBA, but now I’m wondering why I never thought of this approach! 🙂

    1. Chris,

      Just ran this into a test, and had to make a change to get it to work for me. Specifically this line:
      TestForError= try Source,

      I had to modify it to
      TestForError= try #”Changed Type”,

      Reason for me is that my Source was a web page, and it served up an alternate page if the requested one wasn’t there. That meant that the Source step was still valid, and it wasn’t until I accessed a table on the page and started to format it that it blew up.

      By referencing the last step in the query I was able to provide the AlternateOutput if an error occurred at any point, rather than just in the Source retrieval.

      To me, I’m thinking that the template should be to test the very last step before the AlternateOuput line… unless you know a reason that this isn’t a good idea?

      1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        I think you’re probably right here – putting the test on the last line of what would have been the original code is the safest thing to do.

  2. This is great advice. I’d appreciate your help to extend this line of thinking
    We receive a csv file externally upon which a power query is based
    Sod’s law dictates that sooner or later a row or column will be added to to this even though the understanding is that it will be provided in a constant format by the supplier
    That means the transformation built on top of it will fall over (like any vba would)
    It’s as if I need a function that asks ‘has structure of data file changed from original’ and if so, generate an error to tell me to go and investigate
    I’m all over power query but need to mitigate accordingly because the code is tightly bound to the raw data
    Granted, these situations should be rare but to be forewarned would be forearmed

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Hi Anthony, that particular problem has been on my list of things to research and blog about for a while now – I agree, it’s a common requirement. I’ll make sure I cover this soon.

      1. Future proofing to remove new columns is actually pretty easy. In the last step (or just before the error handling step), select all the columns and choose “Remove Other Columns”. The existing column names get hard coded in M, so anything else that gets added in future will just get removed.

        With regards to rows, that’s a bit more tricky. The only way I can reliably see to do this is to maintain a separate table of items that are deemed “valid” based on one of the columns of data you’re transforming. If you have that, then merging the two tables together while only keeping matching records should allow you to work it out.

      2. Thanks Ken, first job of the day is to clean up my remove columns M code
        I think being notified in some way that the structure of a data file has changed from last and defaulting to an error trap situation would be highly beneficial
        I’ve covered off the files not existing situation by running some VBA to check whether all expected data files exist. If not my mass query refresh is prevented
        I’m also referencing the file names and worksheet names using a parameters table on a sheet for added flexibility. Covers off the renamed worksheet gotcha

  3. Hi

    I have been trying to follow your logic to produce an Alternative Output when this function that sends an address to google maps api returns zero results, but I can’t quite get my head around it!

    The code for the functions is below. An example of the address details that return the error are ABBEY+MILL+HOUSE,+ABBEY+SQUARE+READING.

    Does anyone have any helpful suggestions?

    Many thanks

    Dave

    let

    Address = (address as text) =>

    let

    Source = Json.Document(Web.Contents(“https://maps.googleapis.com/maps/api/geocode/json?address=”&address&”&key=….your api key…”)),

    #”Converted to Table” = Record.ToTable(Source),
    Value = #”Converted to Table”{0}[Value],

    #”Converted to Table1″ = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

    #”Expanded Column1″ = Table.ExpandRecordColumn(#”Converted to Table1″, “Column1”, {“geometry”, “place_id”}, {“Column1.geometry”, “Column1.place_id”}),

    #”Expanded Column1.geometry” = Table.ExpandRecordColumn(#”Expanded Column1″, “Column1.geometry”, {“location”}, {“Column1.geometry.location”}),

    #”Expanded Column1.geometry.location” = Table.ExpandRecordColumn(#”Expanded Column1.geometry”, “Column1.geometry.location”, {“lat”, “lng”}, {“Column1.geometry.location.lat”, “Column1.geometry.location.lng”}),

    #”Renamed Columns” = Table.RenameColumns(#”Expanded Column1.geometry.location”,{{“Column1.geometry.location.lat”, “LAT”}, {“Column1.geometry.location.lng”, “LNG”}, {“Column1.place_id”, “ID”}})

    in

    #”Renamed Columns”

    in
    Address

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Hi Dave,

      Does the try.. otherwise.. code in the following example help you?

      let

      address = “ABBEY MILL HOUSE, ABBEY SQUARE READING”,
      Source = Json.Document(Web.Contents(“https://maps.googleapis.com/maps/api/geocode/json?address=”&address)),

      #”Converted to Table” = Record.ToTable(Source),
      Value = #”Converted to Table”{0}[Value],

      #”Converted to Table1″ = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

      #”Expanded Column1″ = Table.ExpandRecordColumn(#”Converted to Table1″, “Column1”, {“geometry”, “place_id”}, {“Column1.geometry”, “Column1.place_id”}),

      #”Expanded Column1.geometry” = Table.ExpandRecordColumn(#”Expanded Column1″, “Column1.geometry”, {“location”}, {“Column1.geometry.location”}),

      #”Expanded Column1.geometry.location” = Table.ExpandRecordColumn(#”Expanded Column1.geometry”, “Column1.geometry.location”, {“lat”, “lng”}, {“Column1.geometry.location.lat”, “Column1.geometry.location.lng”}),

      #”Renamed Columns” = Table.RenameColumns(#”Expanded Column1.geometry.location”,{{“Column1.geometry.location.lat”, “LAT”}, {“Column1.geometry.location.lng”, “LNG”}, {“Column1.place_id”, “ID”}})

      in

      try #”Renamed Columns” otherwise #table({“LAT”, “LONG”, “ID”},{})

  4. Thank you. Clear naivety: I had no idea about “try … otherwise”! That function feeds into others , so I have just used an expanded version of the #table to make those work too

  5. Instead of defining the alternate outputtable by hand :
    AlternativeOutput=#table(type table [Month=text,Product=text,Sales=Int64.Type],

    Would it be possible to get the definition automatically from the last step?

    1. Hey Frenk,

      Even if we could do that you wouldn’t want to. If the file isn’t found, the previous step would return an error.

      I know who you’re after is a cahched version of the previous step, but it doesn’t work that way. Power Query regenerates the data every update as new columns could be added, etc.

      Personally I’d love a one button method to have power query generate a static line like this from the existing preview to hard code an alternate step. That would be useful for a couple of reasons. (Like preserving a snapshot of what table output was supposed to look like if the data source changed.)

      1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        Yes, it would be incredibly useful to get the definition of a table at design time and have a way of making sure the data source always conformed to that structure. Something to suggest to the dev team?

  6. Hi I try your code and I don’t know while the fucntion return all the time false.

    My code
    /**************************************************************************************
    = (FileName as text) =>
    let
    FileSource = Excel.CurrentWorkbook(){[Name=”Directory”]}[Content]{0}[Directory]& FileName &”.xls” ,
    Test = try FileSource,
    Source = if Test[HasError] then Excel.CurrentWorkbook(){[Name=”Directory”]}[Content]{0}[Directory]& FileName &”.xlsx” else Excel.CurrentWorkbook(){[Name=”Directory”]}[Content]{0}[Directory]& FileName &”.xls”
    in
    Source
    /*********************************************************************************
    I tried with two file. The first have a extension .xls and the second have a extension .xlsx

    Thank you for your help

  7. Hey Chris,

    I’m trying to apply a data source error catch like above..but I think I’m running to an issue with M’s lazy evaluation?

    Example:
    let
    option1 = Folder.Files(“C:\Path\”),
    option2 = Folder.Files(“C:\” & “PathCorrectionText” & “\”),
    results = try option1 otherwise option2
    in
    results

    I’ve tried wrapping the option1 and option2 expressions in other functions – but still no luck getting the query to evaluate correctly. It keeps returning the DataSource.NotFound: File or Folder: We couldn’t find the folder error for option 1 even though option2 path exists (it shows results when clicking on the applied step)

      1. I had not.. but thanks for the direction. So my challenge is to find a function further down the line that forces evalution and call that within the try otherwise statement. Value.Metadata returns nothing from Folder.Files so I’ll have to keep trying and update.

        Thanks Chris

      2. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        Thanks Jeff, that’s good to know!

  8. I’m trying to adapt your code to address a query that pulls in an excel table from another workbook. I get a datasource error if the other workbook is open. I’ve tried an adapted version of your code above but the advanced editor says a Token Eof expected. My knowledge of power query is very limited. What am I doing wrong?

    let
    //This is the original code generated by Power Query
    Source = Excel.Workbook(File.Contents(“C:\Users\MM\OneDrive for Business\Clients\AAA\AAA001-P0003\SOpp Categories.xlsm”), null, true),
    Tbl_Statuses_Table = Source{[Item=”Tbl_Statuses”,Kind=”Table”]}[Data],
    #”Changed Type” = Table.TransformColumnTypes(Tbl_Statuses_Table,{{“Status”, type text}})
    in
    //End of original code
    //Define the alternative table to return in case of error
    AlternativeOutput=#table(type table [Status=text],{{“Error”}}),
    //Does the Source step return an error?
    TestForError= try #”Changed Type”,
    //If Source returns an error then return the alternative table output
    //else return the value of the #”Changed Type” step
    Output = if TestForError[HasError] then AlternativeOutput else #”Changed Type”
    in
    Output

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Hi Mike,

      The first problem is that you have two “in” clauses. Can you try deleting the line that contains just “in” before the line containing the comment “//End of original code”?

  9. This is awesome! Do you know if there is a way to capture a MS SQL timeout error using a similar method? I have been unsuccessful finding this answer in my research which leads me to believe it is not possible. I have a near real-time replicated DB which often has performance issues and would like to default to a different snapshot MS SQL DB if the first query times out (making the resulting dataset at most 24 hours behind rather than whenever the primary DB last happened to be stable). Currently, the SQL timeout error bombs the entire refresh causing no table to update. I tried using the method you outlined above, but it seems the SQL timeout is handled very differently by Power Query. For example, it is capital E Error for DataSource.Error rather than lower case e for error as described in your post, and try[HasError] ends up being FALSE.

  10. Hi Chris, Thanks for your post. Truly interesting. How would you deal with the case of the file not existing and not wanting to load anything( for example when what you load is used for a merge or naotherkind of second step in using that load).Thanks

  11. Hi Chris, hope you can help. I have a similar problem. But I want Power Query to Ignore, or not run the query if the filepath is incorrect. I got the first part where to test the file path has error, and if the file path is correct it works. But if the filepath has an error, I want the query to ignore the error, and simply leave the query blank. How to do that?

Leave a Reply to kenpulsCancel reply