Defining Relationships Between Entities In The Common Data Model To Automatically Create Relationships In Power BI

Following on from my last post on attaching manually-created Common Data Model folders in Power BI dataflows, I was asked whether defining relationships between entities in the model.json file of the CDM folder results in relationships being created between tables in a Power BI dataset. I’ve just tested it, and I can confirm it does.

Consider a CDM folder that contains two entities, Sales and Fruit. The Sales entity contains monthly sales data:

image

…and the Fruit entity contains a list of fruit sold:

image

Now –and this is important – let’s say you have defined the entities in the model.json so that the Sales entity  has attributes Month, Product and Sales:

image

…and the Fruit entity has a single attribute called Fruit:

image

Because the Sales entity and the Fruit entity use different names for the attributes that contain the names of the fruit sold, when the two entities are loaded into Power BI no relationships are created between the two tables. There’s no way that Power BI can guess that there should be a relationship between the Product and Fruit columns based on these names.

However, if you add a relationship definition (as documented here) to the model.json file like so:

image

Then, when you load the two entities as tables in the same Power BI dataset, you get a relationship created automatically:

image

As far as I can see it does this by adding a key definition to the Sales table in a similar way to the way the Table.AddKeyM M function/Remove Duplicates transformation does, as I blogged here.

Now if we could add DAX calculations (calculated columns and especially measures) to the definition of the model.json file, so they were automatically created when entities were imported from a dataflow, that would be really cool. I don’t see why this would not be possible because you could store all kinds of useful information – such as the DAX for the calculations – in the metadata record of an M query (you can access this yourself using the Value.Metadata M function) loaded into a dataset, and the Power BI engine could read it when the table is loaded and create the calculations from this.

10 responses

    • You do not need Premium to do this – you can attach an external CDM folder when your workspace is in shared capacity. After that, all your end users need is access to the workspace.

  1. Great job puzzling this out. I don’t think the CDM standard will include ability to define DAX calculations since CDM is a general entity specification used by multiple Microsoft services. Although, having the ability to define custom metadata in the spec and then stuffing DAX calculations in it and then auto-generating the Power BI model has some possibilities.

    Chris, do you think dataflows is the mechanism to accomplish the equivalent of cube write-backs? We’ve just been missing a way to write data from Power BI and now we have it and with multiple files in one folder you can even call it partition-based write-backs. For example, if you have a forecast dataset from source system and we create a forecast_override dataset from a spreadsheet that will allow what-if analysis and combining the two data sets for Power BI reporting will yield most of the benefit of writeback.

    • Interestingly the model.json files for dataflows created in Power BI do include the M code for queries used, so I don’t see why they couldn’t include DAX too.

      Re writeback… Hmm, interesting idea. Maybe it is possible but I would need to think about it.

  2. Pingback: Jump Start to Power BI Data Flows – datadriven.pro

  3. Pingback: Self-Service-Datenaufbereitung für Big Data in Power BI (Dataflows) – datadriven.pro

  4. I did some tests with the DataFlow integrated in Azure DataLake Gen2, and I must say that it is quite promising.

    However, about the possibility of creating measurements in DAX, I still don’t see anything on the subject.
    In the CDM model, there is also nothing about the default aggregation option on an attribute. That would be a good start, wouldn’t it?
    Isn’t there a fork of CDM, dedicated to data analysis?

    Otherwise, concerning the management of file partitions for an entity, I notice that it is very necessary to have read rights on all the files of the entity, otherwise the DataFlow returns an access error.
    Too bad, this would have made it possible to manage security concepts in an entity, by granting rights on only part of the files.

  5. Chris, have you found this to still be accurate? I’ve been messing with a CDM schema published to Power BI as a dataflow, yet when I connect to the dataflow the detect relationships isn’t working as expected.

    • I’ve actually verified that your example works. However, I have an example with many more entities that has these relationships defined that is not behaving in the same manner.

      In this case the system writing the CDM schema uses the ROW_UNIQUEKEY for as the ID field for all entities. Power BI is getting confused and linking these fields rather than the defined relationships. I’m curious if you’ve seen this behavior.

Leave a Reply to MikeAinOz Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: