A lot of my work involves tuning MDX queries, and over the last few months I’ve come across several situations where calculation-heavy queries take a long time to execute but for the most part only seem to use one CPU on a multi-processor server. When I first saw this I made some enquiries about what was going on and found that this was indeed a feature of the AS calculation engine: only one query can be run per session at any one time, and for that query the work of the formula engine (unlike the storage engine) can’t be parallelised. Hohum.
However, then I realised that there are some scenarios where you can inject parallelism into a query. Consider the following query in Adventure Works:
select [Measures].[Amount] on 0,
bottomcount(
{[Scenario].[Scenario].&[1], [Scenario].[Scenario].&[2]}
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
*
[Department].[Departments].[Department Level 02].members
, 10,[Measures].[Amount])
on 1
from
[Adventure Works]
On my laptop, on a warm cache, it consistently executes in just under 40 seconds; my laptop is dual core and for the vast majority of the query time I see just 50% CPU usage. This is consistent with what I found out about the formula engine since the Account dimension has unary operators . However, if I open two different connections in SQL MS and run both of the following queries simultaneously:
select [Measures].[Amount] on 0,
bottomcount(
{[Scenario].[Scenario].&[1]}
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
*
[Department].[Departments].[Department Level 02].members
, 10,[Measures].[Amount])
on 1
from
[Adventure Works]
select [Measures].[Amount] on 0,
bottomcount(
{[Scenario].[Scenario].&[2]}
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
*
[Department].[Departments].[Department Level 02].members
, 10,[Measures].[Amount])
on 1
from
[Adventure Works]
…both queries finish in 20 seconds and I see 100% CPU usage.
What I then did was create a stored procedure with a function called ParallelUnion which, when passed to MDX set operations like the Bottomcounts in the queries above, opened two separate connections, executed the operations in parallel then Unioned the results and returned a set. Here’s an example of how it worked:
select measures.amount
on 0,
bottomcount(
ASSP.ParallelUnion(
"bottomcount(
{[Scenario].[Scenario].&[1]}
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
*
[Department].[Departments].[Department Level 02].members
, 10,[Measures].[Amount])"
,
"bottomcount(
{[Scenario].[Scenario].&[2]}
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
*
[Department].[Departments].[Department Level 02].members
, 10,[Measures].[Amount])"
), 10, [Measures].[Amount])
on 1
from
[Adventure Works]
To be clear, in this query the sproc executes the two Bottomcount set expressions passed in as strings on two different connections and sessions, and returns a unioned set containing the bottom 10 tuples for both [Scenario].[Scenario].&[1] and [Scenario].[Scenario].&[2], 20 tuples in all. Given that the overall bottom 10 tuples must be present in the 20 tuples the sproc returns I still need to do another Bottomcount but it’s relatively inexpensive. The query above executed in 17 seconds, less than half the time of the original query, and shows 100% CPU usage on my laptop.
I also created a ParallelGenerate function which worked like the existing Generate function in MDX and which would allow for more flexibility in splitting up queries in order to make use of more than two processors. Here’s an example:
select measures.amount
on 0,
bottomcount(
ASSP.ParallelGenerate(
{[Scenario].[Scenario].&[1],[Scenario].[Scenario].&[2] },
"bottomcount(
{[Scenario].[Scenario].currentmember}
*
[Account].[Account].[Account].members
*
[Date].[Date].[Date].members
*
[Department].[Departments].[Department Level 02].members
, 10,[Measures].[Amount])"
), 10, [Measures].[Amount])
on 1
from
[Adventure Works]
My next problem was that although I’d proved what was possible, my code wasn’t exactly robust. Luckily at this point the guys from the Analysis Services Stored Procedure Project helped me out: both Jon Axon and Mosha provided some excellent feedback and Greg Galloway took the time to rewrite it all. I’m extremely grateful to all three for their help, esppecially Greg for all the effort he put in. You can download the code (alond with lots of other useful sprocs) as part of the Analysis Services Stored Procedure Project here:
http://www.codeplex.com/Wiki/View.aspx?ProjectName=ASStoredProcedures
(Note you need to download the latest source code and compile it; the last compiled dll release doesn’t contain these functions)
Now to mention the drawbacks to all this (thanks again are due to Mosha for pointing several of these out):
- It’s only useful when the performance gained through parallelism is more than the overhead of calling the sproc, opening the two connections and running the queries. I very much doubt, for example, it would ever be useful to use these sprocs inside a calculated member; they’re only really going to be useful when you’re writing queries for something like Reporting Services and you can use them in set definitions which are only going to be evaluated once.
- If you need the sproc to open its connections with the same security as the account running the main query, you need to use ImpersonateCurrentUser mode rather than the default ImpersonateServiceAccount. This, though, carries an extra performance hit. There might be a way around this using the Username function and the EffectiveUserName connection string property, but it’s not been tested and isn’t in the code.
- Since the sproc opens up new sessions to run its queries, all session state (things such as VisualTotals, session calculated members and session sets) are lost and this can lead to unexpected and incorrect results.
- Similarly, any query context – things like sets in the Where clause, query-scoped calculations etc – is lost in the call to the sproc, again leading to incorrect results, although to a certain extent you can work around this.
That said, I still think these functions can make a massive difference to query performance in some circumstances. I’d be very interested to hear from anyone who does use them in their queries to know what the impact is.
One thought on “Parallelism and MDX Queries”