When are named sets in the WITH clause evaluated?

Following on from my postings of last week about dynamically-generated sets, I had a really interesting email yesterday from my colleague at Solid Quality Mentors Francesco De Chirico. He sent me a variant of one of my queries which worked when he wasn’t expecting it to work and to be honest when I looked at it, I couldn’t work out why it worked either. However after a bit of testing I realised he’d discovered something quite important that I at least didn’t know; as with my last post on sets in the MDX Script it all makes sense when you think about it though (see also Mosha’s comment on my last post for some explanation).

Consider the following query:

with
set mycheapset as {[Department].[Departments].&[2],[Department].[Departments].&[3]}
select {} on 0,
mycheapset on 1
from [Adventure Works]

It returns in a second on a cold cache on my laptop, as you’d expect. Now consider the same query with an extra, expensive set declaration in it that isn’t referenced in the query:

with
set myexpensiveset as bottomcount(
{[Department].[Departments].&[2],[Department].[Departments].&[3]}
*
{[Scenario].[Scenario].&[1],[Scenario].[Scenario].&[2] }
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
, 10,[Measures].[Amount])
set mycheapset as {[Department].[Departments].&[2],[Department].[Departments].&[3]}
select {} on 0,
mycheapset on 1
from [Adventure Works]

This now executes on a cold cache in 13 seconds on my laptop despite returning exactly the same results. Even a query which doesn’t reference any of these sets executes in a minimum of 13 seconds:

with
set myexpensiveset as bottomcount(
{[Department].[Departments].&[2],[Department].[Departments].&[3]}
*
{[Scenario].[Scenario].&[1],[Scenario].[Scenario].&[2] }
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
, 10,[Measures].[Amount])
set mycheapset as {[Department].[Departments].&[2],[Department].[Departments].&[3]}
select {[Measures].[Internet Sales Amount]} on 0
from [Adventure Works] 

As with named sets in the MDX Script, named sets in the WITH clause are being evaluated when the query runs regardless of whether they’re referenced or not.

This has obvious implications for performance tuning: if you’ve got an MDX query which has a lot of expensive set definitions in the WITH clause that aren’t used, then you really need to delete them! However you might have queries in custom-built applications or something like SSRS where sets are present which may or may not be necessary, depending on certain parameters. Here’s an example of the type of thing I’m talking about:

with
set myexpensiveset as extract(bottomcount(
{[Department].[Departments].&[2],[Department].[Departments].&[3]}
*
{[Scenario].[Scenario].&[1],[Scenario].[Scenario].&[2] }
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
, 10,[Measures].[Amount]), [Department].[Departments])

set mycheapset as {[Department].[Departments].&[2],[Department].[Departments].&[3]}

member measures.test as
iif([Department].[Departments].currentmember is [Department].[Departments].&[2],
count(myexpensiveset), count(mycheapset))

select {[Measures].test} on 0,
[Date].[Calendar Year].members on 1
from [Adventure Works]
where([Department].[Departments].&[3])

In this case the WHERE clause might be parameterised and could contain either Department 2 (which would return the count of the expensive set) or Department 3 (which would return the count of the cheap set). You can rewrite this to be much more efficient by taking advantage of the fact that sets are evaluated after the WHERE clause so you only evaluate the expensive set when you need to:

with
set myiifset as
iif([Department].[Departments].currentmember is [Department].[Departments].&[2],
extract(bottomcount(
{[Department].[Departments].&[2],[Department].[Departments].&[3]}
*
{[Scenario].[Scenario].&[1],[Scenario].[Scenario].&[2] }
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
, 10,[Measures].[Amount]), [Department].[Departments]),
{[Department].[Departments].&[2],[Department].[Departments].&[3]})

member measures.test as count(myiifset)

select {[Measures].test} on 0,
[Date].[Calendar Year].members on 1
from [Adventure Works]
where([Department].[Departments].&[3])

3 thoughts on “When are named sets in the WITH clause evaluated?

  1. Hello Chris,
     
    I reference you to the book "MDX solutions with MS SQL Server …". I think you know this book, you\’re one of its authors…
    In the fourth chapter, page 99, you describe the execution stages of a query. The stages, in order are:
    1. From
    2. Where
    3. Named sets (from the With clause)
    4. The tuples on each set
    5. Calculating the cells brought back in the axis intersections
     
    When I first read this I understood that the third step calculating all the named sets so I don\’t understand what is the new think in this post. As Mosha wrote: If it wasn\’t this way, the calculating order could affect the results.

  2. I know, it sounds obvious but as I said in the previous post I was under the impression that sets were only evaluated the first time they were used. I\’m sure that was the case with AS2K… As a result it has performance tuning implications that I hadn\’t thought through.
     
    Mosha\’s comment refers to something slightly different, the fact that the position of set declarations in the script is important – you can have the same set definition at two different points in the script and it can return different results. But what you say about the order of execution within a query is also relevant here.

Leave a Reply to Miky SchreiberCancel reply