Formula Caching and Query Scope

The Analysis Services formula engine’s ability to cache the results of calculated members can be crucial to the overall performance of your cube, but it’s all too easy to turn off this caching. I’ve blogged already about the impact that subselects have on this and today I thought I’d discuss how you need to be careful using different calculation contexts. This is a topic that’s covered in the Analysis Services 2008 Performance Guide, but I thought it would be worth talking about here because it does deserve a lot more visibility as an important cause of poor query performance, especially in SSRS reports with hand-coded MDX.

As you know, you can define calculated members in three different places in Analysis Services: on the cube (the global context), within the session (the session context) and in the WITH clause of a query (the query context). The important paragraph from the Performance Guide is this one: 

If the query processor encounters calculations created at query time, it always uses the query context, even if a query also references calculations from the global context (there is an exception to this – queries with query calculated members of the form Aggregate(<set>) do share the session cache). If there are no query calculations, but there are session calculations, the query processor uses the session cache. The query processor selects the cache based on the presence of any calculation in the scope. This behavior is especially relevant to users with MDX-generating front-end tools. If the front-end tool creates any session calculations or query calculations, the global cache is not used, even if you do not specifically use the session or query calculations.

What does this mean in practical terms though? Consider the following query on Adventure Works that uses a calculated measure defined on the cube, Internet Ratio to Parent Product:

SELECT [Measures].[Internet Ratio to Parent Product] ON 0,
[Product].[Product Categories].MEMBERS ON 1
FROM [Adventure Works]

Clear the cache and run the query and you’ll see the normal stuff taking place if you run a Profiler trace. Then, if you rerun the query on a warm cache, you should see something like this in Profiler (if you are looking at the Get Data From Cache event):

image

Clearly the result of the calculation has been cached, and the second time you run the query you’re retrieving values from the formula engine cache. Now, consider the following query:

WITH MEMBER MEASURES.TEST AS 1
SELECT [Measures].[Internet Ratio to Parent Product] ON 0,
[Product].[Product Categories].MEMBERS ON 1
FROM [Adventure Works]

We’re now creating a calculated measure in the WITH clause but not using it in the query. What happens when we run this query on a warm cache?

image

We’re still hitting cache, but it’s the storage engine cache and not the formula engine cache; the calculations are being re-evaluated for the query. So, even the presence of a calculated measure in the WITH clause, even though we’re not actually using it, prevents the use of the formula engine cache and if we’ve got complex calculations used in our query this could make our overall query performance significantly worse.

Note that the presence of a named set in the WITH clause does not have the same effect, so the following query can use the formula cache:

WITH SET TEST AS {}
SELECT [Measures].[Internet Ratio to Parent Product] ON 0,
[Product].[Product Categories].MEMBERS ON 1
FROM [Adventure Works]

So, some recommendations that follow on from this:

  • If calculation performance is a problem for you, be very, very wary of writing queries that have a WITH clause. Where possible, put the calculated members you need on the cube; even if you don’t want other users to see them you can always set Visible=false.
  • When evaluating client tools look out for those that create session or query scoped calculated members for their own internal use (running a Profiler trace will show you whether they do). This could have the side-effect of reducing query performance on calculation-heavy cubes.

8 thoughts on “Formula Caching and Query Scope

  1. Thanks, Chris! Having good cached times is important here, and I was oblivious to the fact that the addition of any “WITH MEMBER dummy as [anything]” to the query was disabling the global cache.

Leave a Reply to “Multi-threading” the Sql Server Analysis Services Formula Engine II-a parallel query msmdpump proxy | Rui Quintino BlogCancel reply