Chris Webb's BI Blog: Using the Resource Usage Profiler Event For SSAS Multidimensional Query Performance Tuning, Part 2: Many-To-Many Relationships And Non-Materialised Referenced Relationships

You probably know that using many-to-many relationships or non-materialised referenced relationships can be bad for Analysis Services Multidimensional query performance. How can you measure their impact, though? In the first post in this series I showed how the Resource Usage Profiler event could be used to to monitor Storage Engine activity; in this post I’ll use it to show the effect of using these features on the amount of Storage Engine activity that takes place during query execution.

Many-to-many relationships

In my previous post I built a very simple SSAS cube with one measure group called Fact from a table with 5000 rows, then built a dimension called ID from that same table with 5000 members on its only hierarchy. For this post I added a second measure group called Fact Bridge based on the same table, added a new role-playing copy of the existing dimension called M2M ID, and then created a many-to-many relationship from this new dimension to the original measure group via the new measure group.

Because all the dimensions and measure groups are built from the same table, one member on the M2M ID dimension is linked to just one member on the ID dimension, so selecting a member on the M2M ID dimension will give the same result as selecting the member with the same key on the ID dimension. Even though SSAS see a many-to-many relationship, in the data it’s a one-to-one relationship.

As I showed in my last post, selecting one member from the ID dimension in a query like the following:

[sourcecode language=”text” padlinenumbers=”true”]
select
{[Measures].[My Measure]}
on 0,
{[ID].[ID].&[1]}
on 1
from
[M2M Test]
[/sourcecode]

…results in a single page of 256 rows being read and a single row being returned from the cube, as shown by the Resource Usage event in a Profiler trace:

If the query is changed to use the M2M ID dimension instead:

[sourcecode language=”text” highlight=”4″]
select
{[Measures].[My Measure]}
on 0,
{[M2M ID].[ID].&[1]}
on 1
from
[M2M Test]
[/sourcecode]

…the same result is returned, because the member [M2M ID].[ID].&[1] is associated with one member from the [ID].[ID] hierarchy, the member [ID].[ID].&[1], via the intermediate measure group. However the Resource Usage event shows something very different:

The ROWS_SCANNED value has gone from 256 to 5256, and ROWS_RETURNED has gone from 1 to 5003! Why? Part of the explanation is that we now have two measure groups that must be scanned, and the Resource Usage statistics are totals for all Storage Engine activity across all measure groups. In this query the Fact Bridge measure group is scanned first to resolve the many-to-many relationship between the M2M ID and ID dimensions, and then the Fact measure group is scanned to get the value for the measure My Measure. The Fact Bridge measure group only accounts for 256 rows scanned and 1 row returned though, the remaining 5000 rows scanned are from the main Fact measure group. The problem here is that SSAS does not translate the filter on the M2M ID dimension into a filter on the ID dimension (this is a limitation of the way SSAS handles many-to-many) so all the rows on the main Fact measure group get scanned in this query.

This explains something that I have blogged about before here, namely that if you partition your measure group by a dimension that is used in a many-to-many relationship you’ll see that all partitions are scanned and not just the partitions you expect to be scanned. The Resource Usage event shows that even when you don’t see unexpected partition scans happening, using a many-to-many relationship in a query can result in a lot of extra Storage Engine activity and therefore potentially worse query performance.

Non-materialised referenced relationships

Something similar happens when you use non-materialised referenced relationships (although materialised referenced relationships are OK). To test this I created another variation on my original cube, with just one measure group and the ID dimension as before but now with a new, role-playing instance of the ID dimension joining to the measure group through the ID dimension using a non-materialised referenced relationship.

The following query returns the same result as the two queries above:

[sourcecode language=”text” padlinenumbers=”true”]
select {[Measures].[My Measure]} on 0,
{[Ref ID].[ID].&[1]}
on 1
from
[RefDimTest]
[/sourcecode]

…but again, the Resource Usage event shows the entire measure group is being scanned when this query runs:

I’m not much of a fan of referenced relationships anyway – you can usually get rid of them by redesigning your SSAS dimensions or your underlying dimensional model – so this one more reason not to use them.

One thought on “Using the Resource Usage Profiler Event For SSAS Multidimensional Query Performance Tuning, Part 2: Many-To-Many Relationships And Non-Materialised Referenced Relationships”

Pingback: Using the Resource Usage Profiler Event For SSAS Multidimensional Query Performance Tuning, Part 3: Aggregations And Indexes – Chris Webb's BI Blog

Using the Resource Usage Profiler Event For SSAS Multidimensional Query Performance Tuning, Part 2: Many-To-Many Relationships And Non-Materialised Referenced Relationships

Like this:

Published by Chris Webb

One thought on “Using the Resource Usage Profiler Event For SSAS Multidimensional Query Performance Tuning, Part 2: Many-To-Many Relationships And Non-Materialised Referenced Relationships”

Leave a ReplyCancel reply

Share this:

Like this:

Published by Chris Webb

One thought on “Using the Resource Usage Profiler Event For SSAS Multidimensional Query Performance Tuning, Part 2: Many-To-Many Relationships And Non-Materialised Referenced Relationships”

Leave a ReplyCancel reply

Discover more from Chris Webb's BI Blog