Power BI DirectQuery Best Practices For Snowflake And Other Databases

Recently I collaborated with a number of people from Microsoft and Snowflake to write a blog post on best practices for using Power BI in DirectQuery mode on Snowflake. You can read it here:

https://medium.com/@jerellegainey/best-practices-for-using-power-bi-in-directquery-mode-with-snowflake-bfd1312ca7ab

It builds on what is already in the Power BI guidance documentation for DirectQuery, adding some advice specific to Snowflake. It also has a few other tips that are generally applicable to all DirectQuery sources and which aren’t in the guidance docs (yet), such as the importance of setting the Maximum Connections Per Data Source property (which I also blogged about recently here), the fact that you can increase it to 30 on a Premium capacity, and the importance of always testing DirectQuery performance in the Power BI Service rather than in Power BI Desktop. As a result it’s worth reading if you are thinking of using Power BI in DirectQuery mode with Synapse, BigQuery or any other source.

If you are considering using DirectQuery on a project I have one last piece of advice: think very carefully about why you need to use DirectQuery and not Import mode. Many people don’t, and they end up in trouble. In my opinion Import mode should be the default choice, simply because it will almost always perform better than DirectQuery mode, whatever back-end database you’re using.

9 thoughts on “Power BI DirectQuery Best Practices For Snowflake And Other Databases”

  1. Great article! To illustrate the point on “think very carefully about why you need to use DirectQuery and not Import mode”… I’m kicking the tires on a new Snowflake database. Thinking that a table was small, I switched it to import to see if I could speed up a report. 90 minutes and 145 million records later, the table loaded. I’m shocked that it worked and there’s no way I’m keeping a 3 gigabyte Power BI Desktop file, but it opens the door to some interesting options with incremental refresh and Power BI Premium if we need the best possible performance. For now we are switching back to DirectQuery and we know why!

  2. Hey Chris, I’m a long-time lurker of your awesome articles.

    I’m curious whether you’ve done anything with Progress OpenEdge databases, and whether you have any suggestions, tips or best practices for composite modelling with them (I know it isn’t supported out of the box).

    Progress’s own DirectQuery connector leaves much to be desired. I do have ODBC access (direct SQL queries work great; folded queries… not so much). Thanks for any time you can give to my question, and thanks again for your awesome content!

  3. Chris, we have a dashboard that requires some metrics to be real-time, so we have to use DirectQuery. My developer moved everything in the dataset to DirectQuery and it tested fine, but come Monday morning, when 300 users tried to access it, the performance was horrendous! We use Snowflake for the back end and could see that the queries hitting Snowflake weren’t taxing it at all, but performance in Power BI was still awful. We even upped the Fabric capacity to 128 last week as a preventative measure, but it didn’t help. Any suggestions on what we should do? We have to have some of the metrics in real time: does the entire dataset need to be in DirectQuery mode?
    THANKS!!

    1. The first question to ask is always: do you really need real-time data? Users often say they do when they don’t, and even a latency of a few minutes can be handled easily in Import mode with careful partitioning. If you do need DirectQuery you can create a composite model where some tables are in Import mode and some are in DirectQuery. Finally, increasing the capacity isn’t going to help much unless you have also increased the number of connections back to the source: https://blog.crossjoin.co.uk/2024/05/19/diagnosing-power-bi-directquery-connection-limit-performance-problems-with-execution-metrics/

    2. You might want to check whether the SQL queries being generated are optimized: are they performing unnecessary null checks on unused columns, for example?

      Here are some ways to boost performance:
      - Schema and model design: use primary and foreign keys to organize data better, create indexes to speed up queries, and use partitioning and clustering to structure data efficiently.
      - Use dynamic M parameters.
      - Simplify DAX calculations as much as possible.
      - Do not use nested IF statements; use SWITCH instead (see the sketch at the end of this comment).
      - Remove implicit measures and use explicit ones.
      - Set columns as non-nullable in Power BI to reduce unnecessary checks.

      There are more ways to improve performance, but these are a good start. Let me know if you need help! I should probably write a blog post about this.
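
      To illustrate the SWITCH and explicit-measure tips above, here is a minimal DAX sketch. The Sales table, the SalesAmount column, the measure names and the discount thresholds are all hypothetical; treat it as one way of writing the pattern, not the only one:

          -- Explicit measure instead of relying on an implicit aggregation
          Total Sales = SUM ( Sales[SalesAmount] )

          -- Nested IF version: harder to read and extend
          Discount % =
              IF (
                  [Total Sales] >= 100000,
                  0.10,
                  IF ( [Total Sales] >= 50000, 0.05, 0 )
              )

          -- The same logic as a flat SWITCH: SWITCH ( TRUE () ) evaluates
          -- the conditions in order and returns the result of the first one
          -- that is true, with the final value as the fallback
          Discount % (SWITCH) =
              SWITCH (
                  TRUE (),
                  [Total Sales] >= 100000, 0.10,
                  [Total Sales] >= 50000, 0.05,
                  0
              )

      Because SWITCH ( TRUE () ) checks its conditions top to bottom, put the most restrictive conditions first.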
