Configuring Power BI Gateway Data Sources For Files And Folders

Recently I’ve been building a lot of Power BI reports from csv and Excel files, and to make sure that scheduled refresh works I have been setting up data sources in an On Premises Data Gateway (what used to be called the Enterprise Gateway). I had assumed that if I was connecting to file-based data sources in my Power BI dataset then, in the gateway, I would need to set up one data source for each file that I’m connecting to – which is a bit of a pain. In fact it turns out that you can set up a gateway data source for the folder that the files are in instead.

Let me give you an example. Imagine that you have three Excel files in a folder called C:\Sales Data:

image

Now imagine that you have three queries in Power BI that get data from these three files:

Here’s an example of the M code for one of these queries:

[sourcecode language=”text” padlinenumbers=”true”]
let
Source =
Excel.Workbook(
File.Contents("C:\Sales Data\SalesData_1.xlsx")
, null, true),
SalesDataTable_Table =
Source{[Item="SalesDataTable",Kind="Table"]}[Data]
in
SalesDataTable_Table
[/sourcecode]

There’s nothing really to notice here except that the code uses File.Contents() to get the data from a single file – I’m not using Folder.Contents().

However, once the report has been published only one data source needs to be set up in the On Premises Data Gateway for it to refresh successfully, even though the report connects to three different files. Here’s a screenshot of the gateway data source I set up in the Power BI service:

Two things to point out:

  • The data source type is set to Folder
  • The full path property is set to the path of the folder that the files used by the report are in, ie C:\Sales Data

Setting up a single gateway data source for a folder is obviously a much better option than setting up multiple data sources for all the files in the folder. Did everyone else know this but me? I guess this is all related to the inheritance of data privacy settings that I blogged about here.

12 thoughts on “Configuring Power BI Gateway Data Sources For Files And Folders

  1. Chris,

    This is very helpful, as I’m about to embark on a journey with similar circumstances.

    I was wondering, how do the data refreshes with with file-based models (I haven’t tested this yet)? For instance, if you have a report that pulls from several files (in which you have a gateway connection to), can you update one of the files, save it, and then go back and refresh your report on PowerBI.com and expect the data to show up?…or does this require a scheduled refresh of the data source? I wasn’t sure if these were live connections to the files or if that was only possible via the Power BI Desktop tool/Report Server.

  2. Matt Allington – Sydney Australia – I am a full time self service Business Intelligence trainer and consultant specialising in Microsoft Power BI, Power Query, and Power Pivot for Excel
    Matt Allington says:

    I had no idea you could do this. Thanks for sharing.

  3. Paul Bunting – Experienced Software Architect with a demonstrated 20+ years’ developing solutions for the manufacturing and architectural design industries. Skilled in the design and creation of advanced analytical systems for real time system monitoring and process improvements. A focussed professional with keen attention to detail and adaptive approach to development and solution design. Specialist in • Enterprise Manufacturing Intelligence • Real Time Processing and Analytics • Big Data Analytics • Machine Learning and Data Science • IoT Analytics
    developtheweb says:

    Reblogged this on Paul Bunting – Blog and commented:
    Configuring Power BI Gateway Data Sources For Files And Folders
    by Chris Webb

    …”building a lot of Power BI reports from csv and Excel files, and to make sure that scheduled refresh works I have been setting up data sources in an On Premises Data Gateway (what used to be called the Enterprise Gateway). I had assumed that if I was connecting to file-based data sources in my Power BI dataset then, in the gateway, I would need to set up one data source for each file that I’m connecting to – which is a bit of a pain. In fact it turns out that you can set up a gateway data source for the folder that the files are in instead”… https://blog.crossjoin.co.uk/2017/07/14/configuring-power-bi-gateway-data-sources-for-files-and-folders/

  4. Bradley Sawler – Australia – I am a mechanical engineering professional in mining and energy. I have a passion for discovering and implementing tools and processes to improve productivity.
    bernoullies says:

    Thanks, Chris. I have used folders for Power Query, but on a similar note with a single source, when I was getting data from multiple worksheets in an excel file I would query the file for each query until I read a post over at Power Pivot Pro. Same benefit as above, query the file once then reference that query and select the worksheets. Brillant.

  5. @Chris Will this work if we have function that import multiple files dynamically? I am getting error message when i try to configure to use the gateway to refresh the data.
    I am giving the folder location and then sorting by the create time and importing last two files dynamically in to the Power BI.
    Error message when i publish and try to configure gateway.
    You can’t schedule refresh for this dataset because one or more sources currently don’t support refresh.

    Thanks

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      It might depend on how you’re combining the data. It should be possible (see https://blog.crossjoin.co.uk/2016/09/19/loading-data-from-multiple-excel-workbooks-into-power-bi-and-making-sure-data-refresh-works-after-publishing/) but there are a lot of things that can go wrong and mean it doesn’t work.

      1. Chris thanks for your response. The blog is about SharePoint importing Excel workbook using Excel.Workbook, while I am importing from the share drive csv document using Csv.Document(File.Contents()))! Will that not be supported through Gateway? The M code for reference, if you can figure out what can be the issue and possibly the workaround :). I have removed the code around transformation for readability

        let
        Source = Folder.Files(#”Folder Path For ECD”),
        #”Filtered Rows” = Table.SelectRows(Source, each Text.StartsWith([Name], “Click_Data_”)),
        #”Sorted Rows” = Table.Sort(#”Filtered Rows”,{{“Date created”, Order.Descending}}),
        #”Removed Columns” = Table.RemoveColumns(#”Sorted Rows”,{“Extension”, “Date accessed”, “Date modified”, “Date created”, “Attributes”, “Content”}),
        #”Renamed Columns” = Table.RenameColumns(#”Removed Columns”,{{“Name”, “File Name”}}),
        #”Added Index” = Table.AddIndexColumn(#”Renamed Columns”, “Index”, 1, 1),
        FilesToImport = Table.SelectRows(#”Added Index”, each [Index] <= NumberOfFilesToImport),
        #"Added Custom" = Table.AddColumn(FilesToImport, "GetClickData", each Csv.Document(File.Contents([Folder Path] & [File Name]),[Delimiter=",", Columns=17, Encoding=1252, QuoteStyle=QuoteStyle.None]))
        in
        #"Added Custom"

  6. I got this working and it need the function to import the binary content of the sample file instead of importing the file and what transformation you do on that file.
    You have function just pointing to the binary content of the file.
    The query will then loop through all the files from that location and import the content in to the Power BI using that function. You then do the transformation after loading the binary content from all the files. These does get refresh through the Gateway.
    Thanks

  7. Hi Chris, thanks for sharing this. I am happy to report that it also works for sub folders, thereby only needing to have the root folder as a data source in the gateway.

  8. I am trying to do the same with only difference that the query used for my report is getting data from folder and combines all files in the folder. I insert the path and it seems something is wrong with the credentials but I cannot figure out how to fix it. here is the error I receive:

    Unable to connect: We encountered an error while trying to connect to . Details: “We could not register this data source for any gateway instances within this cluster. Please find more details below about specific errors for each gateway instance.

    any idea why is this happening?

Leave a Reply to Baiju ThakkarCancel reply