Overhead Of Getting Relationship Columns In Power BI DirectQuery Mode

Many Power BI connectors for relational databases, such as the SQL Server connector, have an advanced option to control whether relationship columns are returned or not. By default this option is on. Returning these relationship columns adds a small overhead to the time taken to open a connection to a data source and so, for Power BI DirectQuery semantic models, turning this option off can improve report performance slightly.

What are relationship columns? If you connect to the DimDate table in the Adventure Works DW 2017 sample database using Power Query, you’ll see then on the right-hand side of the table. The following M code:

let
Source = Sql.Database("localhost", "AdventureWorksDW2017"),
dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data]
in
dbo_DimDate

…shows the relationship columns:

Whereas if you explicitly turn off the relationships by deselecting the “Including relationship columns” checkbox:

…you get the following M code with the CreateNavigationProperties property set to false:

let
Source = Sql.Database("localhost", "AdventureWorksDW2017", [CreateNavigationProperties=false]),
dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data]
in
dbo_DimDate

…and you don’t see those extra columns.

How much overhead does fetching relationship columns add? It depends on the type of source you’re using, how many relationships are defined and how many tables there are in your model (because the calls to get this information are not made in parallel). It’s also, as far as I know, impossible to measure the overhead from any public telemetry such as a Profiler trace or to deduce it by looking at the calls made on the database side. The overhead only happens when Power BI opens a connection to a data source and the result is cached afterwards, so it will only be encountered occasionally and not for every query that is run against your data source. I can say that the overhead can be quite significant in some cases though and can be made worse by other factors such as a lack of available connections or network/gateway issues. Since I have never seen anyone actually use these relationship columns in a DirectQuery model – they are quite handy in Power Query in general though – you should always turn them off when using DirectQuery mode.

[Thanks to Curt Hagenlocher for the information in this post]

6 thoughts on “Overhead Of Getting Relationship Columns In Power BI DirectQuery Mode

  1. Would you expect there to be a performance difference between disabling them in the connector or removing them in a later step, or will they end up the same?

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      You will need to disable them in the connector. If you remove them in a later step you’ll still have the overhead I think.

  2. @Chris – How much would it have costed you if you were not a M$ employee but a consultant to try Copilot and write a blog post about it

    Cheers
    Sam

Leave a ReplyCancel reply