Configuring Power BI Gateway Data Sources For Files And Folders

Recently I’ve been building a lot of Power BI reports from csv and Excel files, and to make sure that scheduled refresh works I have been setting up data sources in an On Premises Data Gateway (what used to be called the Enterprise Gateway). I had assumed that if I was connecting to file-based data sources in my Power BI dataset then, in the gateway, I would need to set up one data source for each file that I’m connecting to – which is a bit of a pain. In fact it turns out that you can set up a gateway data source for the folder that the files are in instead.

Let me give you an example. Imagine that you have three Excel files in a folder called C:\Sales Data:

image

Now imagine that you have three queries in Power BI that get data from these three files:

image

Here’s an example of the M code for one of these queries:

[sourcecode language=”text” padlinenumbers=”true”]
let
Source =
Excel.Workbook(
File.Contents("C:\Sales Data\SalesData_1.xlsx")
, null, true),
SalesDataTable_Table =
Source{[Item="SalesDataTable",Kind="Table"]}[Data]
in
SalesDataTable_Table
[/sourcecode]

There’s nothing really to notice here except that the code uses File.Contents() to get the data from a single file – I’m not using Folder.Contents().

However, once the report has been published only one data source needs to be set up in the On Premises Data Gateway for it to refresh successfully, even though the report connects to three different files. Here’s a screenshot of the gateway data source I set up in the Power BI service:

image

Two things to point out:

  • The data source type is set to Folder
  • The full path property is set to the path of the folder that the files used by the report are in, ie C:\Sales Data

Setting up a single gateway data source for a folder is obviously a much better option than setting up multiple data sources for all the files in the folder. Did everyone else know this but me? I guess this is all related to the inheritance of data privacy settings that I blogged about here.

12 thoughts on “Configuring Power BI Gateway Data Sources For Files And Folders

  1. Chris,

    This is very helpful, as I’m about to embark on a journey with similar circumstances.

    I was wondering, how do the data refreshes with with file-based models (I haven’t tested this yet)? For instance, if you have a report that pulls from several files (in which you have a gateway connection to), can you update one of the files, save it, and then go back and refresh your report on PowerBI.com and expect the data to show up?…or does this require a scheduled refresh of the data source? I wasn’t sure if these were live connections to the files or if that was only possible via the Power BI Desktop tool/Report Server.

    1. you have to do scheduled refresh for your data to show up in Power BI report. There is no live connection to excel (file) data source.

  2. Reblogged this on Paul Bunting – Blog and commented:
    Configuring Power BI Gateway Data Sources For Files And Folders
    by Chris Webb

    …”building a lot of Power BI reports from csv and Excel files, and to make sure that scheduled refresh works I have been setting up data sources in an On Premises Data Gateway (what used to be called the Enterprise Gateway). I had assumed that if I was connecting to file-based data sources in my Power BI dataset then, in the gateway, I would need to set up one data source for each file that I’m connecting to – which is a bit of a pain. In fact it turns out that you can set up a gateway data source for the folder that the files are in instead”… https://blog.crossjoin.co.uk/2017/07/14/configuring-power-bi-gateway-data-sources-for-files-and-folders/

  3. Thanks, Chris. I have used folders for Power Query, but on a similar note with a single source, when I was getting data from multiple worksheets in an excel file I would query the file for each query until I read a post over at Power Pivot Pro. Same benefit as above, query the file once then reference that query and select the worksheets. Brillant.

  4. @Chris Will this work if we have function that import multiple files dynamically? I am getting error message when i try to configure to use the gateway to refresh the data.
    I am giving the folder location and then sorting by the create time and importing last two files dynamically in to the Power BI.
    Error message when i publish and try to configure gateway.
    You can’t schedule refresh for this dataset because one or more sources currently don’t support refresh.

    Thanks

      1. Chris thanks for your response. The blog is about SharePoint importing Excel workbook using Excel.Workbook, while I am importing from the share drive csv document using Csv.Document(File.Contents()))! Will that not be supported through Gateway? The M code for reference, if you can figure out what can be the issue and possibly the workaround :). I have removed the code around transformation for readability

        let
        Source = Folder.Files(#”Folder Path For ECD”),
        #”Filtered Rows” = Table.SelectRows(Source, each Text.StartsWith([Name], “Click_Data_”)),
        #”Sorted Rows” = Table.Sort(#”Filtered Rows”,{{“Date created”, Order.Descending}}),
        #”Removed Columns” = Table.RemoveColumns(#”Sorted Rows”,{“Extension”, “Date accessed”, “Date modified”, “Date created”, “Attributes”, “Content”}),
        #”Renamed Columns” = Table.RenameColumns(#”Removed Columns”,{{“Name”, “File Name”}}),
        #”Added Index” = Table.AddIndexColumn(#”Renamed Columns”, “Index”, 1, 1),
        FilesToImport = Table.SelectRows(#”Added Index”, each [Index] <= NumberOfFilesToImport),
        #"Added Custom" = Table.AddColumn(FilesToImport, "GetClickData", each Csv.Document(File.Contents([Folder Path] & [File Name]),[Delimiter=",", Columns=17, Encoding=1252, QuoteStyle=QuoteStyle.None]))
        in
        #"Added Custom"

  5. I got this working and it need the function to import the binary content of the sample file instead of importing the file and what transformation you do on that file.
    You have function just pointing to the binary content of the file.
    The query will then loop through all the files from that location and import the content in to the Power BI using that function. You then do the transformation after loading the binary content from all the files. These does get refresh through the Gateway.
    Thanks

  6. Hi Chris, thanks for sharing this. I am happy to report that it also works for sub folders, thereby only needing to have the root folder as a data source in the gateway.

  7. I am trying to do the same with only difference that the query used for my report is getting data from folder and combines all files in the folder. I insert the path and it seems something is wrong with the credentials but I cannot figure out how to fix it. here is the error I receive:

    Unable to connect: We encountered an error while trying to connect to . Details: “We could not register this data source for any gateway instances within this cluster. Please find more details below about specific errors for each gateway instance.

    any idea why is this happening?

Leave a Reply to bernoulliesCancel reply