Power BI Sentinel: Backup, Documentation, Change Tracking And Lineage Tracking For Power BI

A few weeks ago at SQLBits I had a demo of a very interesting new tool for Power BI users called Power BI Sentinel. The website, with all the details, is here:

https://www.powerbisentinel.com

It helps solve several problems that everyone managing a Power BI deployment has to deal with. It can:

  • Backup reports and datasets (as .pbix files) direct from your App Workspaces to Azure Blob storage on a schedule, so you are able to access earlier versions and roll back if you need to.
  • Generate documentation on your datasets, including the DAX calculations used.
  • Identify what visuals and filters have changed in different versions of a report, and when those changes took place.
  • Track which data sources are used by which datasets and, in turn, which reports use those datasets.

image

It’s still a very new tool and adding functionality all the time; there is also a lot of functionality that you’d want from a tool like this that Microsoft hasn’t built the APIs to support yet. It’s definitely worth checking out though and (in my opinion) quite reasonably priced. I dare say Microsoft will build some of this functionality into Power BI at some point, but I don’t know what it will deliver exactly or when that will happen.

Full disclosure: I’ve known the owners of the company for a long time through their involvement with the UK SQL Server and Power BI community, and I was given a free licence for Power BI Sentinel by them.

Defining Relationships Between Entities In The Common Data Model To Automatically Create Relationships In Power BI

Following on from my last post on attaching manually-created Common Data Model folders in Power BI dataflows, I was asked whether defining relationships between entities in the model.json file of the CDM folder results in relationships being created between tables in a Power BI dataset. I’ve just tested it, and I can confirm it does.

Consider a CDM folder that contains two entities, Sales and Fruit. The Sales entity contains monthly sales data:

image

…and the Fruit entity contains a list of fruit sold:

image

Now –and this is important – let’s say you have defined the entities in the model.json so that the Sales entity  has attributes Month, Product and Sales:

image

…and the Fruit entity has a single attribute called Fruit:

image

Because the Sales entity and the Fruit entity use different names for the attributes that contain the names of the fruit sold, when the two entities are loaded into Power BI no relationships are created between the two tables. There’s no way that Power BI can guess that there should be a relationship between the Product and Fruit columns based on these names.

However, if you add a relationship definition (as documented here) to the model.json file like so:

image

Then, when you load the two entities as tables in the same Power BI dataset, you get a relationship created automatically:

image

As far as I can see it does this by adding a key definition to the Sales table in a similar way to the way the Table.AddKeyM M function/Remove Duplicates transformation does, as I blogged here.

Now if we could add DAX calculations (calculated columns and especially measures) to the definition of the model.json file, so they were automatically created when entities were imported from a dataflow, that would be really cool. I don’t see why this would not be possible because you could store all kinds of useful information – such as the DAX for the calculations – in the metadata record of an M query (you can access this yourself using the Value.Metadata M function) loaded into a dataset, and the Power BI engine could read it when the table is loaded and create the calculations from this.

Attaching Manually-Created Common Data Model Folders In Power BI Dataflows

Most of the focus on Power BI dataflows so far has been on the use of Power Query Online to load data from external data sources. However, when you create a dataflow you also have the option to attach an existing Common Data Model (CDM) folder stored in Azure Data Lake Storage Gen2:

image

The documentation has some very detailed information on this subject: you can read how to configure Power BI to use your own Azure Data Lake Storage Gen2 account here, and you can read about how to attach an existing CDM folder here.

The interesting thing for me is that this means that any external service – whether it’s a Microsoft service, a third party service or something you build yourself – can output data into a folder in Azure Data Lake Storage Gen2, and so long as it’s in the right format you can expose this data as a dataflow in Power BI. So, I thought, why not create some data manually, put it in the cloud and see if I can get Power BI to attach it as a dataflow?

It turns out it isn’t too difficult to do, and what’s more when you look at the documentation for the Common Data Model and specifically the documentation for the format of the model.json file there are all kinds of interesting features to check out.

Consider the following folder containing three files, all created manually and uploaded to my Azure Data Lake Storage Gen2 account using Azure Storage Explorer:

image

The two csv files are pretty simple:

image

image

The model.json file, which contains the metadata for the dataflow, is also not that complicated:

image

This model.json file describes a single entity called Sales with three columns: Month, Product and Sales. It also specifies that the data for this entity is stored in two partitions and these partitions are linked to the two csv files above (I’m pretty sure this is how incremental refresh is handled in dataflows too). I like the idea of another application being able to save new csv files to a folder and, after updating the model.json file to add a new partition, this data being appended onto the table of data returned by the entity.

It took a few goes to attach the folder to a dataflow, mostly because when it fails the error messages (at least at the time of writing) are completely unhelpful. Two things I ran into are:

  • Any spaces in folder or file paths need to be url encoded, ie replaced with %20
  • You really do need to grant permissions on the folder to the Power BI users who are going to access the dataflow. RTFM.

After that, the dataflow is visible in the Power BI service:

image

…and the data from the Sales entity can be loaded into your dataset in Power BI Desktop:

image

image

As I said, there’s more for me to research here. For example, looking in the documentation, you can create two types of entities – Local and Referenced (are the latter the same as Linked Entities?) – and you can also define relationships between entities (do these become relationships between tables in Power BI?). I think this opens up a lot of new possibilities for close integration between Power BI and other services!

Book Review: “Collect, Combine And Transform Data Using Power Query In Excel And Power BI” by Gil Raviv

I generally try to avoid writing book reviews here, but the fact that there are so few books available on Power Query and M means that I’m making an exception for “Collect, Combine and Transform Data using Power Query in Excel and Power BI” by Gil Raviv.

The first thing to say about this book is that it takes the approach of teaching through exercises and worked examples, rather than by explaining abstract concepts. If this suits your style of learning (and I know that it does for a lot of people) then you’re in luck; if you’re looking for a book that will explain what all the different join types for Merge operations do, for example, then you’ll be disappointed. This isn’t a criticism though – I don’t think it’s possible to write a book that will satisfy everyone – and Gil has done a good job of covering a lot of common data preparation scenarios. One important exception to this is the chapter on M which provides a very clear introduction to the language and the way it works. I suspect a lot of people will want to buy the book for this chapter alone.

The second thing to say about this book is that while it covers both Power Query in Excel and Power BI, in my opinion it’s aimed slightly more at Excel users. Again, this is not a criticism: although advanced Excel users and Power BI report designers have to solve many of the same problems, they also have some very different concerns too. What’s more, if you can assume your readers have good Excel skills and can explain Power Query concepts in Excel terms then you’ll serve that particular set of readers well, and probably do a better job for them than if you assume they are completely new to the area of data transformation and preparation have no existing skills in this area.

All in all it’s a good book that I can recommend to anyone who wants to learn Power Query and M, and also for intermediate users who want to deepen their knowledge. I still think there’s a need for a book completely devoted to M and covering topics such as custom connectors and dataflows; hopefully someone writes one soon!

Full disclosure: I received a free copy of the book from the author. I’m also the author of a Power Query book myself, but to be honest it’s several years old now and a bit out-of-date, so it’s hard to recommend it any more.

You can buy a copy of this book from Amazon UK here.

SQLBits Power BI And Analysis Services Videos Now Free To View Online

SQLBits is one of the best Microsoft data platform conferences around, and last week’s event in Manchester was particularly good. As usual, videos of almost all of the sessions are available for everyone to view for free online (no registration required) here:

https://sqlbits.com/content/Event18

There were lots of Power BI and Analysis Services related sessions, so I thought I’d call out a few:

  • If you’re interested in the new calculation groups feature in SSAS 2019 that I blogged about last week, you should definitely watch Christian Wade’s two-part session here and here (part two has all the juicy details in), which also gives some details about other upcoming features such as XMLA endpoints. Kasper’s session here covers a lot of the same topics.
  • There’s more insight into Microsoft’s Power BI roadmap and thinking in the Q&A session with Christian, Kasper and Adam here
  • Marco and Alberto always do great sessions, and Alberto’s session on Aggregations here and Marco’s session on many-to-many relationships here are up to their usual high standards.
  • My session on Power BI Dataflows here sums up my current thoughts about them.

Of course there’s lots more there (more than I have had a chance to watch) so let me know if there are other sessions that are good!

SSAS Tabular 2019, Calculation Groups And Migration From SSAS Multidimensional

With the release of CTP 2.3 of SQL Server 2019 today there was big news for Analysis Services Tabular developers: Calculation Groups. You can read all about them in detail in this blog post:

https://blogs.msdn.microsoft.com/analysisservices/2019/03/01/whats-new-for-sql-server-2019-analysis-services-ctp-2-3/

In my opinion this is the most important new feature in DAX since… well, forever. It allows you to create a new type of calculation – which in most cases will be a time intelligence like a year-to-date or a previous period growth – that can be applied to multiple measures; basically the same thing that we have been doing in SSAS Multidimensional for years with the time utility/shell/date tool dimension technique. It’s certainly going to solve a lot of problems for a lot of SSAS Tabular implementations, many of which have hundreds or even thousands of measures for every combination of base measure and calculation type needed.

I’m not going to repeat any of the detailed technical information in the blog post here, though. Instead the point I want to make is that this is very big news for SSAS Multidimensional users too. In the past couple of years many people who have existing SSAS Multidimensional implementations have thought about migrating to SSAS Tabular so they can take advantage of its new features or move to the cloud, and indeed many of them have already migrated successfully. However, up to now, the biggest blocker for those wanting to migrate from Multidimensional to Tabular has been the fact that some complex calculations that can be expressed in MDX cannot be recreated (or recreated easily and efficiently) in DAX, because DAX has not had an equivalent of calculated members not on the Measures dimension or the MDX SCOPE statement.

Calculation groups do not remove this roadblock completely, but they do remove the roadblock for a large group of existing SSAS Multidimensional users whose only complex calculation requirement is a time utility/shell/date tool dimension. As a result these SSAS Multidimensional users will now be able to migrate to SSAS Tabular 2019, Azure Analysis Services or Power BI if they want to. Only those people who have more exotic uses for calculated members not on the Measures dimension (which are not very common at all) and those who use SCOPE statements (a larger group – many people working with financial data use SCOPE statements heavily) will find that Multidimensional is still the correct platform for them.

%d bloggers like this: