Madison and SSAS?

On Monday Microsoft announced "SQL Server Fast Track", a set of reference architectures for data warehousing with SQL Server. This has all been blogged very well by others, so if you’re interested in finding out more I suggest you read Peter Koller and Curt Monash.

A couple of accompanying white papers have also been released, though, and I was reading this one:
when I noticed the following statement:
Project code name "Madison" is the upcoming Microsoft scale-out solution for very large data warehouses (VLDW). Madison is based on the MPP technology developed by DATAllegro and the proven SQL Server 2008 database. Madison expands the DATAllegro hub-and-spoke solution to include not only MPP appliances but also standard symmetric multi-processing (SMP) instances of SQL Server 2008 and SQL Server Analysis Services (SSAS), allowing either to be viewed as nodes within a grid.

and also:
With the upcoming release of Madison, MPP scalability and grid connectivity can be taken to a new level. Madison expands the DATAllegro hub-and-spoke solution to include not only MPP appliances but also standard SMP instances of SQL Server 2008 and SSAS to be viewed as nodes within a grid. A grid of SMP databases and MPP appliances can be used as the basis for any large-scale data warehouse environment or architecture. However, it is particularly suitable for a hub-and-spoke architecture.

So Analysis Services is clearly going to be supported as part of Madison somehow, as a ‘node within a grid’. What does this mean exactly? I’m not really sure. The focus of the paper is the ‘Hub and Spoke’ architecture and how Madison will enable this through its ability to transfer large amounts of data quickly via ‘high speed, parallel transfers’ over its grid. The following DATAllegro white paper offers some more detail on this:

Maybe I’m reading too much into the specific references to SSAS above, but it does seem like something is afoot with Madison and SSAS even if it is just that we’ll get a quick way of moving SSAS databases around. I suppose we’ll find out soon enough…

SQL Server Conference in Italy

I’m doing some work with Marco Russo and Alberto Ferrari at the moment, and for the benefit of any Italian (or Italian-speaking) readers of this blog I thought I’d mention that they are involved in organising a SQL Server conference near Milan. You can find out more and register here:

It looks like they’ve got a lot of good BI/Analysis Services content…

Implementing Real Analysis Services DrillDown in a Reporting Services Report

Sean Boon recently blogged about an approach to implementing drilldown on charts with Reporting Services when Analysis Services is used as the data source, and it got me thinking about ways to implement drilldown in Reporting Services in general. There are two widely-known standard methods of doing this:

  • The first can be described as "fetch all the data you’re ever going to need to display and then hide the stuff that hasn’t been drilled down on yet" – this article describes it well, albeit for SQL data sources. It’s easy to implement but has a big problem: if the amount of data your report could ever possibly display is massive then the report will be very slow to run, for example if your dataset query returns millions of rows.
  • The second is more scalable, in that you have multiple reports for each level of granularity you want to display, and when you drill down or drill up you click on a member in the report and pass it as a parameter to another report. This also works well but has a different problem: you now have multiple copies of what is essentially the same report to maintain and keep in synch. This approach can also only display one level of granularity at a time, and sometimes it’s nice to be able to see multiple granularities in the same report.

Wouldn’t it be good to have drilldown capabilities in Reporting Services just like you have in any other Analysis Services client? That’s to say, you’d see a list of members on rows in your report, you’d click on one and then see all its children, then click again and its children would disappear? Well, it is possible and it’s a problem I’ve tackled numerous times myself. The last time was when I was writing the samples for Intelligencia Query, but I’ve just come up with an even better approach which I thought I’d blog about. I’ve implemented it for the standard Analysis Services data source, although I’ll be honest, it took me a few goes to get it to work properly (there would have been many fewer hacky workarounds if I’d been using Intelligencia Query!) and I’m not sure it’s 100% robust; hopefully someone will find this useful though.

What I’ve done is basically a variation on the second approach above, but instead of using multiple reports I’ve created a single report which calls itself when you drill down on a member. The really tricky part is how you manage the set of members you’ve drilled up and down on, and this is where I’d struggled in the past – the solution I’ve got here uses a hidden parameter to manage that set, which is then passed to the main dataset and used with the DrillDownMember function.
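The rule the hidden parameter implements is easier to see outside MDX. Here’s a minimal Python sketch of the same logic (the function name and signature are mine, purely for illustration):

```python
def update_drilldowns(previous, member_clicked, is_leaf):
    """Return the new set of drilled-down members after a click.

    previous: list of member unique names drilled down on so far
    member_clicked: unique name of the member just clicked on
    is_leaf: True if the clicked member has no children
    """
    if is_leaf:
        # A leaf member can't be drilled down on: leave the set unchanged
        return list(previous)
    if member_clicked in previous:
        # Already expanded, so this click collapses it (drill up)
        return [m for m in previous if m != member_clicked]
    # Not yet expanded, so add it to the set (drill down)
    return previous + [member_clicked]
```

Each click replaces the whole set, which is why a single hidden parameter is enough to carry the report’s entire drill state between executions.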

Here are the steps to get it working:

  1. Create a new Reporting Services report with a data source pointing to Adventure Works.
  2. Create three new report parameters in this order:
    1. MemberClicked – tick "Allow Blank Values" and set the default value to [Customer].[Customer Geography].[All Customers]. This parameter will hold the unique name of the member the user clicked on to drill down.
    2. PreviousDrillDowns – again tick "Allow Blank Values" and set the default value to [Customer].[Customer Geography].[All Customers], and tick "Allow Multiple Values". This parameter will hold the list of members the user drilled down on before the last drill down.
    3. DrillDowns – again tick "Allow Blank Values" and tick "Allow Multiple Values". This parameter will hold the complete list of members drilled down on for the current report.
  3. Create a new Dataset in the report called DrillDowns. Use the following MDX for the query:

    WITH
    MEMBER MEASURES.CUSTUNAME AS
    [Customer].[Customer Geography].CURRENTMEMBER.UNIQUENAME
    SELECT {MEASURES.CUSTUNAME} ON 0,
    UNION({[Customer].[Customer Geography].[All Customers]},
    IIF(
    ISLEAF(STRTOMEMBER(@MemberClicked)), STRTOSET(@PreviousDrillDowns),
    IIF(
    INTERSECT({STRTOMEMBER(@MemberClicked)}, STRTOSET(@PreviousDrillDowns)).COUNT > 0,
    //DRILL UP
    EXCEPT(STRTOSET(@PreviousDrillDowns), {STRTOMEMBER(@MemberClicked)}),
    //DRILL DOWN
    UNION(STRTOSET(@PreviousDrillDowns), {STRTOMEMBER(@MemberClicked)}))))
    ON 1
    FROM [Adventure Works]

    What this does is take the set of previously drilled down members, and if the member we’ve just drilled down on is not in there return the set of all previously drilled down members plus the new member (for drilling down); if it is present, return the set of all previously drilled down members except the new member (for drilling up). If the member we’ve clicked on is a leaf member, we can ignore the click and just return the set of all previously drilled down members.

    You’ll need to hook up the two parameters @PreviousDrillDowns and @MemberClicked to the report parameters you’ve previously declared. To do this, first of all in the query designer declare the parameters but just fill in the names and a default, such as [Customer].[Customer Geography].[All Customers] (see here, towards the end, for more detailed steps). Then exit the query designer but stay in the Dataset Properties dialog and create two dataset parameters with the names PreviousDrillDowns and MemberClicked and hook them up to the appropriate report parameters.

  4. Go to the report parameter called DrillDowns and set the default value to be the CUSTUNAME field from the dataset you’ve just created.
  5. Create a second dataset called DisplayQuery with the following MDX:

    WITH
    MEMBER MEASURES.CUSTNAME AS
    Space([Customer].[Customer Geography].CURRENTMEMBER.LEVEL.ORDINAL) +
    [Customer].[Customer Geography].CURRENTMEMBER.NAME
    MEMBER MEASURES.CUSTUNAME AS
    [Customer].[Customer Geography].CURRENTMEMBER.UNIQUENAME
    SELECT {[Measures].[Internet Sales Amount], MEASURES.CUSTNAME, MEASURES.CUSTUNAME} ON 0,
    DRILLDOWNMEMBER({[Customer].[Customer Geography].[All Customers]},
    StrToSet(@DrillDowns, CONSTRAINED), RECURSIVE)
    ON 1
    FROM [Adventure Works]

    This query simply displays the measure Internet Sales Amount on columns, and on rows uses the DrillDownMember function to drilldown on the All Member on Customer Geography plus any other visible members that are present in the set returned from the DrillDowns parameter.

    Once again you’ll have to hook up the @DrillDowns parameter to the DrillDowns report parameter.

  6. Now, in the report, create a table and bind it to the DisplayQuery dataset. Only use the CUSTNAME field to display the members for the Customer Geography hierarchy on rows – this means you have a single field that can contain members from all levels of the hierarchy.
  7. Finally, open the Textbox Properties dialog for the cell bound to the CUSTNAME field and set up an Action to jump to the report we’re currently building. We also need to pass two parameters: one that sends the value of the CUSTUNAME field (note this is the unique name of the member clicked on, not the CUSTNAME field, which is just the member name) to the MemberClicked parameter, and one that sends the value of the DrillDowns parameter to the PreviousDrillDowns parameter. It’s not actually obvious how to pass the values of a multi-value parameter through an Action, but I found the answer here; the expression you’ll need to use for this report is:
    =Split(Join(Parameters!DrillDowns.Value, ","),",")
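To see why the Split/Join trick works: a multi-value parameter is an array of strings, and the Action can only carry it once it’s been flattened into a single string. Here’s the same round trip sketched in Python (my own illustration; it assumes no member unique name contains a comma, which would require a different delimiter):

```python
def pass_multivalue(values):
    """Mimic =Split(Join(Parameters!DrillDowns.Value, ","), ",")."""
    # Join the array into one comma-separated string the Action can carry...
    joined = ",".join(values)
    # ...then split it back into an array for the target report's parameter
    return joined.split(",")
```

The round trip returns the original array, which is what lets the target report’s PreviousDrillDowns parameter receive the full set of drilled-down members.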

Here’s what you’d expect to see when you first run the report:


Click on Australia and then South Australia and you get this:


Click on Australia again and you’d go back to what’s shown in the first screenshot.

I realise these steps are pretty complex, so I’ve created a sample report in SSRS2008 format and uploaded it to my SkyDrive here:

I dream of the day when SSRS will do all this stuff for me automatically…

UPDATE: you can now view the sample report online (minus the indenting for members on different levels, for some reason) here –

SQLBits IV Registration Open!

Registration for SQLBits IV (the UK’s – and perhaps the world’s – largest free SQL Server tech conference), which will be taking place on March 28th in Manchester, is now open:

We’ve got four tracks of top-notch presentations including some very strong BI sessions. I’ll be speaking, and among other speakers we’ve got SSIS-superstar Jamie ‘twoblogs’ Thomson for the first time. You can see the agenda here:

I’m also doing a pre-conference seminar "Introduction to MDX":
It’s basically day one of the MDX training course that I’ve run successfully as a private course for the last few years. We’ll be covering the basics of MDX – sets, tuples, members, popular functions – right up to building the most common types of calculated member; we won’t be covering any advanced stuff like MDX Script assignments or performance tuning. So if you’ve always meant to learn MDX but have been thoroughly confused by it, come along!

Songsmith and Data Audiolization for BI

Data audiolization is clearly a real subject that someone, somewhere is researching… and after the fad for data visualisation, why shouldn’t we be thinking about how to represent data with sound? Anyway, I’ll cut to the chase. This video has been doing the rounds on Facebook; it made me laugh, and if I didn’t have a hundred better things to be doing I’d be downloading a copy of Microsoft Songsmith right now and working out how to hook it up to Analysis Services:


Adventure Works the musical, anyone?

More on Oracle 11g and MDX

Following on from reports last year that Simba Technologies had built a 2005-flavour OLEDB for OLAP provider for Oracle’s OLAP option, here are some more details:
and here’s a slide deck on it:

One other interesting point made on their slides is that they’re planning to do the same thing for Cognos and SAP/Business Objects too.

PASS European Conference 2009 and Analysis Services Monitoring

Last November, at the PASS Summit in Seattle, I presented a session on building a monitoring solution for Analysis Services, Integration Services and Reporting Services which seemed to go down pretty well. I was lucky in that the SQLCat team presented a very similar session, although just covering Analysis Services, the next day – so at least I got to present first! Anyway, I see that they’ve just got round to publishing their material on this subject here:

Meanwhile, I’ll be working on expanding my material (which was a bit rough-and-ready) into a full day pre-conference seminar which I’ll be presenting with Allan Mitchell at the PASS European Conference, on April 22nd-24th in Neuss in Germany:

Allan, being the SSIS expert, will be covering that side of things and rewriting my packages so they’re rather more robust; that will allow me to concentrate on the SSAS/SSRS side of things, which I know better. We have a vague plan to release all of our code on Codeplex or somewhere similar; I know a lot of people are also interested in this area.

With a bit of luck I’ll also be speaking at the main conference, but I don’t think the full agenda has been decided yet. I had a good time there last year and hopefully I’ll see some of you there this year too!

Arbitrary-shaped sets and the Storage Engine cache

Here’s a companion-piece to my post last week about query context and the formula engine cache – another scenario where you can easily stop caching taking place without knowing it, which has already been documented (although there is at least one important point to add) but again needs more visibility. This time the problem is that when you write an MDX query with an arbitrary-shaped set in the Where clause it stops Analysis Services using the storage engine cache. Queries that suffer from this will always read from disk and always perform as well or as badly as they did the first time they were run – so if cold cache performance is a problem for you, then this is an issue you need to understand and avoid.

Rather than repeat the information, let me direct you to the blog entry where I first found out about this problem, on Thomas Keyser’s blog from 2006:

I can confirm that everything he says is still relevant on SSAS2008 except for the last query, where he has the whole of the Product.[Product Categories] hierarchy in the Where clause – run it twice and the second time you run it you’ll see it does hit the storage engine cache. One other point I picked up on Mosha’s MDX seminar in November is that it is possible for Analysis Services to think a set is arbitrary-shaped when it really isn’t. Take the following query:

SELECT [Measures].[Internet Sales Amount] ON 0,
[Date].[Calendar Year].MEMBERS ON 1
FROM [Adventure Works]
WHERE({[Product].[Category].&[1], [Product].[Category].&[3]})

This does not have an arbitrary-shaped set in the Where clause, and as a result the second time you run it you’ll see it hit the storage engine cache. However, if you rewrite the query so you have a set of tuples in the Where clause as follows:

SELECT [Measures].[Internet Sales Amount] ON 0,
[Date].[Calendar Year].MEMBERS ON 1
FROM [Adventure Works]
WHERE({([Customer].[Country].&[Australia], [Product].[Category].&[1]),
([Customer].[Country].&[Australia], [Product].[Category].&[3])})

Even though this set is really just the crossjoin of {Australia} with the two categories – and so not arbitrary-shaped at all – you’ll see that this query does not use the storage engine cache.
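To make "arbitrary-shaped" concrete: a set of tuples is non-arbitrary when it is exactly the crossjoin of the distinct members appearing on each hierarchy. Here’s a rough Python sketch of that test (my own illustration – the point above is precisely that Analysis Services doesn’t seem to perform this normalisation on sets of tuples):

```python
from itertools import product

def is_arbitrary_shaped(tuples):
    """True if the set of tuples is NOT a pure crossjoin of the
    distinct members appearing in each tuple position (hierarchy)."""
    # Collect the distinct members per position, preserving order
    members = [list(dict.fromkeys(col)) for col in zip(*tuples)]
    # The set is non-arbitrary iff it equals the crossjoin of those lists
    return set(tuples) != set(product(*members))

# Both tuples share Australia, so this IS {Australia} x {two categories}:
# a crossjoin, hence not arbitrary-shaped
crossjoin_set = [("Australia", "Bikes"), ("Australia", "Clothing")]
# Mixing countries and categories unevenly makes the shape arbitrary
arbitrary_set = [("Australia", "Bikes"), ("Canada", "Clothing")]
```

By this test the tuple set in the query above is non-arbitrary, yet the storage engine still treats it as arbitrary-shaped, which is why the rewritten query below misses the cache.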

What can we do about this then? Not a lot with most client tools; I’ve not checked, but I’d be surprised if any of them generated their MDX to avoid this situation. If your users frequently use certain arbitrary-shaped sets the only thing you could maybe do is hack your dimension data to make them non-arbitrary – but that would almost certainly end up being a bad compromise; otherwise you’d just have to build aggregations to make cold cache queries fast.

However, if you’re using SSRS then of course you can rewrite the MDX yourself. Let’s build a quick report on AdventureWorks that displays this problem:


As you can see, I’ve got a multiselect parameter on the slicer that has a default selection of members from two different levels from [Product].[Product Categories] – an arbitrary shaped set. Here’s the MDX that gets generated:

SELECT NON EMPTY { [Measures].[Internet Sales Amount] } ON COLUMNS,
NON EMPTY { ([Date].[Calendar Year].[Calendar Year].ALLMEMBERS ) }
DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME ON ROWS
FROM ( SELECT ( STRTOSET(@ProductProductCategories, CONSTRAINED) )
ON COLUMNS FROM [Adventure Works])
WHERE ( IIF( STRTOSET(@ProductProductCategories, CONSTRAINED).Count = 1,
STRTOSET(@ProductProductCategories, CONSTRAINED),
[Product].[Product Categories].currentmember ) )
CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE,
FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS

And here’s how I would rewrite it:

SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS,
NON EMPTY { ([Date].[Calendar Year].[Calendar Year].ALLMEMBERS ) } ON ROWS
FROM [Adventure Works]
WHERE(
DESCENDANTS(
STRTOSET(@ProductProductCategories, CONSTRAINED),
[Product].[Product Categories].LEVELS(
MAX(STRTOSET(@ProductProductCategories, CONSTRAINED),
[Product].[Product Categories].CURRENTMEMBER.LEVEL.ORDINAL))))

What I’ve done here is:

  • Get rid of the completely useless Non Empty on columns (Why is it there? We don’t even want to remove empty columns – surely doing that would break the report?)
  • Removed the subselect and used a Where clause instead to do the slicing, so if we needed it we could use the formula engine cache (see here for why).
  • Removed any cell or dimension properties that aren’t used by the report (see here for why, although it’s only relevant for really big reports).
  • Used an expression in the Where clause to find the descendants of all members in the parameter set at the level of the lowest member of the set, using the Descendants, Levels and Max functions. I think this will turn all arbitrary-shaped set selections into non-arbitrary-shaped sets.
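The Descendants/Levels/Max trick in the last bullet can be sketched in Python terms like this (the toy hierarchy and helper names are mine, purely to show the transformation):

```python
def flatten_selection(selection, level_of, children_of):
    """Replace every member in `selection` with its descendants at the
    deepest level present in the selection, mirroring
    DESCENDANTS(set, hierarchy.LEVELS(MAX(set, level ordinal)))."""
    # Find the ordinal of the lowest (deepest) level in the selection
    target = max(level_of[m] for m in selection)

    def descend(m):
        # Members already at the target level map to themselves
        if level_of[m] == target:
            return [m]
        # Otherwise recurse down to the target level
        return [d for c in children_of.get(m, []) for d in descend(c)]

    out = []
    for m in selection:
        out.extend(descend(m))
    return out

# Toy hierarchy: Accessories (level 1) has two level-2 children
level_of = {"Accessories": 1, "Helmets": 2, "Locks": 2, "Road Bikes": 2}
children_of = {"Accessories": ["Helmets", "Locks"]}
```

A mixed-level selection like Accessories plus Road Bikes comes out entirely at level 2, so the resulting set is no longer arbitrary-shaped.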