Returning The Contents Of Any Table In Any Excel Workbook In Sharepoint With Power Pivot And Power Query

Yet another Power Query post this week (with more to come) – today I’m going to show you how you can use Power Pivot and Power Query to build a dynamic solution that allows a user to browse through all the Excel workbooks stored in a Sharepoint document library and see the contents of any table in any workbook they choose. Why would you want to do this? Well, I’m not really sure, but I visited a customer a few months ago where an analyst (a data steward?!) had created a repository of Excel workbooks containing various datasets used for BI in Sharepoint. Maybe it could be useful to use Power Query to browse these datasets… and even if it isn’t, it’s still a learning experience for me 🙂

The first step is to build a Power Query query that returns a list of all of the Excel workbooks in a Sharepoint document library. There are a couple of ways of doing this; you could use the SharePoint.Contents() function, but I opted to use the OData REST api that Sharepoint exposes because it was faster and easier to work with. The following Power Query query is all that’s needed to get a list of files in a library called MetadataTest:

let

    Source = OData.Feed("https://zzzz.sharepoint.com/_vti_bin/listdata.svc/MetadataTest"),

    RemovedOtherColumns = Table.SelectColumns(Source,{"Id", "ContentType", "Path", "Name"})

in

    RemovedOtherColumns

 

image

These all happen to be Excel workbooks, which makes things easier for me.

The next thing to do is to get a list of all the tables in any of these Excel workbooks. Again, you can use OData for this. Here’s an example query that requests the $metadata of the OData endpoint that an Excel workbook called FruitVegPeople.xlsx exposes, interprets the response as an XML document, and finds the list of tables in the workbook from that XML document (this last step accounts for 90% of the code below):

let

    Source = Xml.Document(

    OData.Feed(

    "https://zzzz.sharepoint.com/

    _vti_bin/ExcelRest.aspx/MetadataTest/FruitVegPeople.xlsx/OData/$metadata"

    )),

    Value = Source{0}[Value],

    Value1 = Value{0}[Value],

    Value2 = Value1{0}[Value],

    Value3 = Value2{3}[Value],

    #"Expand Value" = Table.ExpandTableColumn(Value3, "Value"

        , {"Name", "Namespace", "Value", "Attributes"}

        , {"Value.Name", "Value.Namespace", "Value.Value", "Value.Attributes"}),

    #"Expand Attributes" = Table.ExpandTableColumn(#"Expand Value", "Attributes"

        , {"Name", "Namespace", "Value"}

        , {"Attributes.Name", "Attributes.Namespace", "Attributes.Value"}),

    FilteredRows = Table.SelectRows(#"Expand Attributes", each ([Attributes.Name] = "Name")),

    RemovedOtherColumns = Table.SelectColumns(FilteredRows,{"Attributes.Value"}),

    RenamedColumns = Table.RenameColumns(RemovedOtherColumns,{{"Attributes.Value", "TableName"}})

in

    RenamedColumns

 

In this case, there are three tables called Fruit, Vegetable and Person returned:

image

At this point you can find a list of files in a library and find a list of tables in an Excel workbook; what you really need is a single list of all the tables in all of the files. This can be accomplished by turning the second query above into a function, and calling that function for each file returned by the first query. Here’s the function:

let

    Source = (SharepointRootURL, ExcelSourceDoc) => 

 

let

    Source = 

    Xml.Document(

    OData.Feed(SharepointRootURL & "/_vti_bin/ExcelRest.aspx" & ExcelSourceDoc & "/OData/$metadata"

    )

    ),

    Value = Source{0}[Value],

    Value1 = Value{0}[Value],

    Value2 = Value1{0}[Value],

    Value3 = Value2{3}[Value],

    #"Expand Value" = Table.ExpandTableColumn(Value3, "Value", {"Name", "Namespace"

    , "Value", "Attributes"}, {"Value.Name", "Value.Namespace", "Value.Value", "Value.Attributes"}),

    #"Expand Attributes" = Table.ExpandTableColumn(#"Expand Value", "Attributes"

    , {"Name", "Namespace", "Value"}, {"Attributes.Name", "Attributes.Namespace", "Attributes.Value"}),

    FilteredRows = Table.SelectRows(#"Expand Attributes", each ([Attributes.Name] = "Name")),

    RemovedOtherColumns = Table.SelectColumns(FilteredRows,{"Attributes.Value"}),

    RenamedColumns = Table.RenameColumns(RemovedOtherColumns,{{"Attributes.Value", "TableName"}})

in

    RenamedColumns

 

in

    Source

 

And here it is used:

let

    SharepointRootURL = "https://zzzz.sharepoint.com",

    Source = OData.Feed(SharepointRootURL & "/_vti_bin/listdata.svc/MetadataTest"),

    RemovedOtherColumns = Table.SelectColumns(Source,{"Id", "ContentType", "Path", "Name"}),

    InsertedCustom = Table.AddColumn(RemovedOtherColumns, "Custom"

    , each GetTablesFunction(SharepointRootURL, [Path] & "/" & [Name] )),

    #"Expand Custom" = Table.ExpandTableColumn(InsertedCustom, "Custom", {"TableName"}, {"TableName"}),

    InsertedCustom1 = Table.AddColumn(#"Expand Custom", "TableODataURL", each SharepointRootURL 

    & "/_vti_bin/ExcelRest.aspx" & [Path] & "/"  & [Name] & "/OData/" & [TableName]),

    RemovedColumns = Table.RemoveColumns(InsertedCustom1,{"ContentType", "Id"})

in

    RemovedColumns

 

image

This gives you the list of workbooks and tables you need, and this table itself can now be loaded into the Excel Data Model. You can now build a report using this data something like this:

image

The slicers allow the user to select a folder, a workbook and a table in a workbook. In the top right hand corner is a PivotTable displaying a measure with the following definition:

TableODataURLMeasure:=

IF(HASONEVALUE(GetDocumentsTables[TableODataURL]), VALUES(GetDocumentsTables[TableODataURL]))

This returns the URL containing the OData feed for the contents of the selected table. The PivotTable is a bit of a red herring, in fact – I only created it so the slicer highlighting would work properly, a trick I use quite frequently when I’m working with Excel cube functions (the PivotTable itself can be hidden from the end user). In the bottom right hand corner, in the cell selected in the screenshot above, is a CubeValue() function that returns the same measure value and I’ve put that cell into an Excel table – which means that we can now use this measure value as an input to yet another Power Query query.

Two last bits of code: first, a Power Query function that will return the contents of any OData feed passed to it.

let

    Source = (ODataTableURL) => 

 

let

    Source = OData.Feed(ODataTableURL),

    RemovedColumns = Table.RemoveColumns(Source,{"excelUpdated", "excelRowID"})

in

    RemovedColumns

 

in

    Source

 

Finally, a query that takes this function and calls it for the URL selected by the user above:

let

    Source = GetTableContentsFunction(Excel.CurrentWorkbook(){[Name="TableODataURL"]}[Content]{0}[URL])

in

    Source

 

And there you have it – a way of selecting any table in any Excel workbook in a Sharepoint library and seeing its contents. You can download the sample workbook here, although of course you’ll have to modify all the URLs to make it work on. 

PS Is it just me, or does “Power Query query” seem like a bit of a mouthful? Would it be ok just to talk about queries in future? That seems a bit misleading… maybe I should talk about “a Power Query”? That doesn’t sound right either. Hmmm.

11 thoughts on “Returning The Contents Of Any Table In Any Excel Workbook In Sharepoint With Power Pivot And Power Query

    1. I am confronted with are very similar requirement which requires consolidating several sheets in several workbooks stored in a SharePoint library. Unfortunately, it is a SharePoint 2007 site, and the Odata service does not seem to be available in this version. I had to use the “export to spreadsheet” feature of the library to get the list of files.
      But a very instructive post anyway. Thanks.

  1. Hi, Chris. Is it possible to return an Excel Table from another workbook that is not on Sharepoint? Obviously you can return a Table from the current workbook (Excel.CurrentWorkbook) and you can return sheets from another workbook (Excel.Workbook), which you can then whittle down to a table that matches your original table. And now I see that it can be done through odata with Sharepoint. But it’s not intuitive how to do it with a standalone, outside workbook.

    Thanks for all the great work you’re doing with Power Query.

  2. Hi Chris, really helpfull post, thanks! I’ve bumped into a similar problem, and your work helped me a lot to create my solution. I’m no expert in programming and naturally my case had its own nuances, but I have managed to overcome them so far. However, I’m now stuck in a complication I’m not finding any clue on the internet to solve it and I would be really gratefull if you could shed some light. The line Value3 = Value2{3}[Value], in the GetTablesFunction is only applicable if there are always 3 tables in the workbooks one is looping through. In my case that number can vary. Most of the excel files I am dealing with have just one table, but there are a few that have more. I’m wondering if there’s any property of Value2 that retrieves the # of elements of that array, therefore making it possible to write something like len = Value2.Length, Value3 = Value2{len}[Value]. Note: each file has only 1 table/sheet, so those that have 3, have also 3 sheets. I’m not sure if this is relevant, but I think it’s worth mentioning. Thank you (and sorry for the long comment)

    1. Hi Joao, the Value3 line isn’t referring to the number of tables in the workbook – it’s referring to the part of the XML in the response from the web service that contains the list of tables.

      1. Chris, thank you very much for the quick response. Ok I get that. However, the issue is that if I write Value2{3}[Value], the query breaks and shows the error: “There were not enough elements in the enumeration to complete the operation.”. If I write Value2{1}[Value] then the query puts the actual table names in the TableName column for workbooks with a single table (hence 1 sheet), but for workbooks with 3 tables or 5 tables (1/sheet), that same TableName column contains the column names of the tables (notice it does not specify from which table those columns are, the tables have all the same column names. I’ve tested to change a column name and it just alters in the query when I’ve changed in the majority of the tables…curious). So, for single sheet single table wb, number 1 refers to the part of the XML that contains the list of tables, but I’m guessing that for a wbs with >1 tables that number refers to the part of the XML that contains the column names… is there any property that retrieves the proper index for the table names for all situations?

Leave a Reply to JoãoCancel reply