It’s very easy to get confused between a Live connections and DirectQuery mode in Power BI: the customers I work with frequently get them mixed up, partly because they are similar concepts, partly because other BI tools use the term “live connection” to refer to what Power BI calls DirectQuery mode. In this post I will explain the differences between these two terms.
Power BI Datasets, Azure Analysis Services and SQL Server Analysis Services
The first thing to understand is that when a Power BI report runs it can only get data from one of three places: a Power BI dataset; an Azure Analysis Services (AAS) database; or an on-premises instance of SQL Server Analysis Services(SSAS). There are some exceptions to this rule but we can ignore them for the purposes of this post. Each of these three sources run variations of the same engine, ie the Analysis Services engine, and can be queried in DAX. These sources may in turn get data from other sources when they are queried but as far as the Power BI report is concerned the only sources it can talk to directly are Power BI datasets, AAS databases and SSAS databases.
Normally, when you open up Power BI Desktop and start developing, your .pbix file contains the definition of a Power BI dataset and a report that connects to it. When you publish this .pbix file they are split up and you see separate dataset and report objects in your Power BI workspace.
However, in Power BI Desktop you may instead choose to connect to a Power BI dataset that has already been published to the Power BI Service, or an instance of AAS or SSAS, and use that as the source for the report. The main advantage of doing this is that different developers can work on the dataset/AAS database/SSAS database definition and the report at the same time. When you connect to an already-published dataset or an AAS or SSAS database from Power BI Desktop you are creating a Live connection.
To connect to a published Power BI dataset and create a Live connection choose the “Power BI datasets” option under the Get Data menu:
If you want to create a Live connection to AAS or SSAS choose the Analysis Services source and then make sure the “Connect live” option is selected:
After you’ve created the Live connection you’ll see a message telling you this in the bottom right-hand corner of the screen in Power BI Desktop:
The key thing to understand is that when you create a Live connection your .pbix file no longer contains a dataset definition – it only contains a report, and that report connects to a dataset, AAS database or SSAS database that already exists elsewhere.
The concept of a Live connection only really makes sense in Power BI Desktop: as I said, in the Power BI Service reports and datasets are separate objects anyway. A report can only have one Live connection because a report can only ever connect to a single dataset, AAS database or SSAS database.
If you publish a report with a Live connection to an already-published Power BI dataset, then the only new thing that will appear in the Power BI Service is the report . If you publish a report with a Live connection to AAS or SSAS then you will see a new dataset appear in the Power BI Service but this isn’t really a dataset, it’s really just a pointer to the AAS or SSAS database where the data is really being held.
There are two fundamental ways a Power BI dataset, AAS database or SSAS Tabular database can work with data: Import mode and DirectQuery mode. In Import mode the data needed for your dataset is copied into the dataset and stored inside it. In DirectQuery mode, when a Power BI report requests data from the dataset by running a DAX query against it, the dataset in turn generates queries that are run on your data source to get the data it needs on demand. If your data source is a SQL Server database then Power BI will generate SQL queries to get the data it needs; for other sources it may generate queries in other query languages. A Power BI dataset is always a semantic layer, an easily consumable interface for your BI data: in Import mode it is a semantic layer and it also stores the data it needs; in DirectQuery mode it is just a semantic layer.
The only data sources that Power BI can use in DirectQuery mode are those that have some kind of query language that Power BI can generate. You can’t connect to a CSV file in DirectQuery mode, for example, because there’s no way of running queries against it unless you load the data in it into some other form of database. Other data sources may have rich query languages but Power BI will only be able to connect to them in DirectQuery mode if a suitable connector exists for that source. Most commonly-used relational databases can be used in DirectQuery mode. If your data source supports DirectQuery mode you’ll see that as an option on the dialog when you first connect:
The key thing to understand here is that the term “DirectQuery mode” describes where the data for a dataset is kept, ie outside the dataset as opposed to inside it in Import mode.
Different tables in a single Power BI dataset can have different storage modes: some can be in Import mode and some can be in DirectQuery mode. Furthermore, different DirectQuery tables in the same Power BI dataset can connect to different data sources. When you have different tables using different storage modes or different tables using different DirectQuery datasets then you have what is called a “composite model“.
DirectQuery on Power BI datasets and Analysis Services
One final twist is that you can create Power BI datasets in DirectQuery mode and use other Power BI datasets, AAS databases or SSAS databases, as a data source. This means you can have DirectQuery datasets that connect to other datasets that can potentially connect to other datasets. It’s useful because, for example, you can have a single, centrally-developed dataset containing all of your company’s sales data and then other Power BI developers can create their own datasets that connect to this central dataset and extend it by adding extra tables of data and new measures without ever duplicating the data in the central dataset.
Although the way things behave behind the scenes is somewhat different, a Power BI dataset in DirectQuery mode that connects to another Power BI dataset as its source is conceptually very similar to a Power BI dataset in DirectQuery mode that connects to a relational database like SQL Server: it just generates DAX queries to get the data it needs when it is queried rather than SQL queries.
Live connections and DirectQuery mode are two distinct concepts. Live connections occur when you create reports in Power BI Desktop that connect to already-published Power BI datasets, AAS databases or SSAS databases. DirectQuery mode is one of the two fundamental ways that a Power BI dataset can work with data (the other being Import mode) and is the mode where a dataset does not contain any data but instead queries other data sources to get the data it needs when it is itself queried.
[Thanks are due to my friend Peter Myers – this blog post expands on some explanations we developed last year in a session we co-presented at various Power BI user groups called “Demystifying Power BI Datasets”. Several recordings of this session exist on YouTube, for example here.]