Handling Data Source Errors In Power Query

Recently I was asked an interesting question by Ondra Plánička in the comments of a blog post: how can you handle errors caused by unavailable or missing data sources in Power Query?

Let’s imagine you are loading a csv file like this one into Excel using Power Query:

image

The M query generated by Power Query will be as follows:

let

    Source = Csv.Document(File.Contents("C:\Users\Chris\Documents\SampleData.csv")

                ,null,",",null,1252),

    #"First Row as Header" = Table.PromoteHeaders(Source),

    #"Changed Type" = Table.TransformColumnTypes(#"First Row as Header"

                ,{{"Month", type text}, {"Product", type text}, {"Sales", Int64.Type}})

in

    #"Changed Type"

 

If you load into the Excel Data Model you’ll see the following in the Power Pivot window:

image

So far so good. But what happens if you try to refresh the query and the csv file is not there any more? The query refreshes but you will see the following in the Power Pivot window:

image

The structure of the table that has been loaded has changed: instead of three columns you get just one, containing the error message. This wipes any selections in Excel PivotTables that are based on this table; they will need to be recreated when the source file is available once again. Similarly, any relationships between this table and other tables in the Excel Data Model get deleted and have to be added again manually when the source file is there again. Not good.

Here’s how to alter the query so that it handles the error more gracefully:

let

    //This is the original code generated by Power Query

    Source = 

      Csv.Document(File.Contents("C:\Users\Chris\Documents\SampleData.csv"),null,",",null,1252),

    #"First Row as Header" = Table.PromoteHeaders(Source),

    #"Changed Type" = Table.TransformColumnTypes(#"First Row as Header",

      {{"Month", type text}, {"Product", type text}, {"Sales", Int64.Type}}),

    //End of original code

    //Define the alternative table to return in case of error    

    AlternativeOutput=#table(type table [Month=text,Product=text,Sales=Int64.Type],

      {{"Error", "Error", 0}}),

    //Does the Source step return an error?

    TestForError= try Source,

    //If Source returns an error then return the alternative table output

    //else return the value of the #"Changed Type" step

    Output = if TestForError[HasError] then AlternativeOutput else #"Changed Type"

in

    Output

 

While the code from the original query remains intact, the following extra steps have been added:

  • The AlternativeOutput step returns a table (defined using #table) that has exactly the same columns as the csv file. This table has one row containing the text “Error” in the two text columns and 0 in the Sales column.
  • The TestForError step uses a try to see whether the Source step returns an error (for example because the file is missing)
  • The Output step checks to see whether TestForError found an error – if it does, it returns the table defined in the AlternativeOutput step, otherwise it returns the contents of the csv file as returned by the #”Changed Type” step.

Now when you run the query and the csv file is missing, then you see the following in the Power Pivot window:

image

Because this table has the same structure as the one the query returns when the csv file is present, any PivotTables connected to this table will still retain their selections and the relationships in the Excel Data Model are left intact. This means that when the csv file is back in its proper place everything works again with no extra work required.

You can download the example workbook and csv file here.

23 thoughts on “Handling Data Source Errors In Power Query

  1. Fantastic, Chris. 🙂

    I ran into this issue with a client’s file too, but not related to pivot tables. We retrieved the data via Power Query into a table, and then added a column to check the existence of the customer ID in another table. (Seemed like a good idea at the time, but in retrospect should have merged a query into Power Query to check.) Challenge is that when Power Query choked and went to it’s single column view, it removed our calculated column… which was part of some critical logic. I actually ended up dealing with it using VBA, but now I’m wondering why I never thought of this approach! 🙂

    • Chris,

      Just ran this into a test, and had to make a change to get it to work for me. Specifically this line:
      TestForError= try Source,

      I had to modify it to
      TestForError= try #”Changed Type”,

      Reason for me is that my Source was a web page, and it served up an alternate page if the requested one wasn’t there. That meant that the Source step was still valid, and it wasn’t until I accessed a table on the page and started to format it that it blew up.

      By referencing the last step in the query I was able to provide the AlternateOutput if an error occurred at any point, rather than just in the Source retrieval.

      To me, I’m thinking that the template should be to test the very last step before the AlternateOuput line… unless you know a reason that this isn’t a good idea?

  2. This is great advice. I’d appreciate your help to extend this line of thinking
    We receive a csv file externally upon which a power query is based
    Sod’s law dictates that sooner or later a row or column will be added to to this even though the understanding is that it will be provided in a constant format by the supplier
    That means the transformation built on top of it will fall over (like any vba would)
    It’s as if I need a function that asks ‘has structure of data file changed from original’ and if so, generate an error to tell me to go and investigate
    I’m all over power query but need to mitigate accordingly because the code is tightly bound to the raw data
    Granted, these situations should be rare but to be forewarned would be forearmed

    • Hi Anthony, that particular problem has been on my list of things to research and blog about for a while now – I agree, it’s a common requirement. I’ll make sure I cover this soon.

      • Future proofing to remove new columns is actually pretty easy. In the last step (or just before the error handling step), select all the columns and choose “Remove Other Columns”. The existing column names get hard coded in M, so anything else that gets added in future will just get removed.

        With regards to rows, that’s a bit more tricky. The only way I can reliably see to do this is to maintain a separate table of items that are deemed “valid” based on one of the columns of data you’re transforming. If you have that, then merging the two tables together while only keeping matching records should allow you to work it out.

      • Thanks Ken, first job of the day is to clean up my remove columns M code
        I think being notified in some way that the structure of a data file has changed from last and defaulting to an error trap situation would be highly beneficial
        I’ve covered off the files not existing situation by running some VBA to check whether all expected data files exist. If not my mass query refresh is prevented
        I’m also referencing the file names and worksheet names using a parameters table on a sheet for added flexibility. Covers off the renamed worksheet gotcha

  3. Cracking, been pondering this issue for last few hours then stumbled upon this. Proper Error Handling in M Code, Fantastic!

  4. Hi

    I have been trying to follow your logic to produce an Alternative Output when this function that sends an address to google maps api returns zero results, but I can’t quite get my head around it!

    The code for the functions is below. An example of the address details that return the error are ABBEY+MILL+HOUSE,+ABBEY+SQUARE+READING.

    Does anyone have any helpful suggestions?

    Many thanks

    Dave

    let

    Address = (address as text) =>

    let

    Source = Json.Document(Web.Contents(“https://maps.googleapis.com/maps/api/geocode/json?address=”&address&”&key=….your api key…”)),

    #”Converted to Table” = Record.ToTable(Source),
    Value = #”Converted to Table”{0}[Value],

    #”Converted to Table1″ = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

    #”Expanded Column1″ = Table.ExpandRecordColumn(#”Converted to Table1″, “Column1”, {“geometry”, “place_id”}, {“Column1.geometry”, “Column1.place_id”}),

    #”Expanded Column1.geometry” = Table.ExpandRecordColumn(#”Expanded Column1″, “Column1.geometry”, {“location”}, {“Column1.geometry.location”}),

    #”Expanded Column1.geometry.location” = Table.ExpandRecordColumn(#”Expanded Column1.geometry”, “Column1.geometry.location”, {“lat”, “lng”}, {“Column1.geometry.location.lat”, “Column1.geometry.location.lng”}),

    #”Renamed Columns” = Table.RenameColumns(#”Expanded Column1.geometry.location”,{{“Column1.geometry.location.lat”, “LAT”}, {“Column1.geometry.location.lng”, “LNG”}, {“Column1.place_id”, “ID”}})

    in

    #”Renamed Columns”

    in
    Address

    • Hi Dave,

      Does the try.. otherwise.. code in the following example help you?

      let

      address = “ABBEY MILL HOUSE, ABBEY SQUARE READING”,
      Source = Json.Document(Web.Contents(“https://maps.googleapis.com/maps/api/geocode/json?address=”&address)),

      #”Converted to Table” = Record.ToTable(Source),
      Value = #”Converted to Table”{0}[Value],

      #”Converted to Table1″ = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

      #”Expanded Column1″ = Table.ExpandRecordColumn(#”Converted to Table1″, “Column1”, {“geometry”, “place_id”}, {“Column1.geometry”, “Column1.place_id”}),

      #”Expanded Column1.geometry” = Table.ExpandRecordColumn(#”Expanded Column1″, “Column1.geometry”, {“location”}, {“Column1.geometry.location”}),

      #”Expanded Column1.geometry.location” = Table.ExpandRecordColumn(#”Expanded Column1.geometry”, “Column1.geometry.location”, {“lat”, “lng”}, {“Column1.geometry.location.lat”, “Column1.geometry.location.lng”}),

      #”Renamed Columns” = Table.RenameColumns(#”Expanded Column1.geometry.location”,{{“Column1.geometry.location.lat”, “LAT”}, {“Column1.geometry.location.lng”, “LNG”}, {“Column1.place_id”, “ID”}})

      in

      try #”Renamed Columns” otherwise #table({“LAT”, “LONG”, “ID”},{})

  5. Thank you. Clear naivety: I had no idea about “try … otherwise”! That function feeds into others , so I have just used an expanded version of the #table to make those work too

  6. Instead of defining the alternate outputtable by hand :
    AlternativeOutput=#table(type table [Month=text,Product=text,Sales=Int64.Type],

    Would it be possible to get the definition automatically from the last step?

    • Hey Frenk,

      Even if we could do that you wouldn’t want to. If the file isn’t found, the previous step would return an error.

      I know who you’re after is a cahched version of the previous step, but it doesn’t work that way. Power Query regenerates the data every update as new columns could be added, etc.

      Personally I’d love a one button method to have power query generate a static line like this from the existing preview to hard code an alternate step. That would be useful for a couple of reasons. (Like preserving a snapshot of what table output was supposed to look like if the data source changed.)

      • Yes, it would be incredibly useful to get the definition of a table at design time and have a way of making sure the data source always conformed to that structure. Something to suggest to the dev team?

  7. Hi I try your code and I don’t know while the fucntion return all the time false.

    My code
    /**************************************************************************************
    = (FileName as text) =>
    let
    FileSource = Excel.CurrentWorkbook(){[Name=”Directory”]}[Content]{0}[Directory]& FileName &”.xls” ,
    Test = try FileSource,
    Source = if Test[HasError] then Excel.CurrentWorkbook(){[Name=”Directory”]}[Content]{0}[Directory]& FileName &”.xlsx” else Excel.CurrentWorkbook(){[Name=”Directory”]}[Content]{0}[Directory]& FileName &”.xls”
    in
    Source
    /*********************************************************************************
    I tried with two file. The first have a extension .xls and the second have a extension .xlsx

    Thank you for your help

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s