Things I’d like to see in Analysis Services 2011

It’s that time in the release cycle. I know what the new features are going to be for Analysis Services (for a full list see here) and it’s rather like Boxing Day: I’ve just got my presents and I’m starting to wonder to what Santa will bring on his next visit. And to a certain extent I’m feeling as though I should have been a bit more specific when I was writing my Christmas list last time, given that I’m a teeny bit underwhelmed by the features I did get in AS2008. So here, for the benefit of the elves working away in Redmond, is a list of things I’d love to see in Analysis Services 2011 or whatever it will be called:

  • Auto-partitioning
    Building and maintaining partitions is a job that almost everyone needs to do and yet it’s something that involves way too much hard work at the moment. As such it’s an ideal candidate for some form of automation, and indeed a while back Jamie Thomson posted the following suggestion on Connect: Other possibilities include the use of data mining to determine how you should set the slices for your partitions (which I blogged about a long time ago) and something like the MDX driven auto-partitioning that Greg Galloway implemented for the Analysis Services Stored Procedure Project.
  • Do away with the arbitrary shaped sets error
    This is a particular bugbear of mine. The more I work on cubes with complex calculations, the more I find that I want to scope calculations on arbitrary shaped sets, which of course I’m not allowed to do. That means I end up making several identical assignments to get the same result, which leads to way too much duplicate code and a maintenance nightmare. If I can break up an arbitrary shaped set into something acceptable to Analysis Services, why can’t Analysis Services do this itself? Here’s the Connect I logged on this:
  • Fix parent-child hierarchies
    Parent-child hierarchies are a pain. For a start they cause performance problems so what I would want to see is the ability for AS to create aggregations at intermediate levels in a parent-child hierarchy at the very least; I’m sure there are a lot of other potential engine improvements that could be made for them. Secondly, scoping calculations on them is a real nightmare and on a related note trying to use dimension security with them is an equally thankless task. Given that there are some situations where you have no option but to use them, I think some time needs to be invested in making them better.
  • Build parallelism into the calculation engine
    What with all the block computation improvements in AS2008, I’m finding that many calculations are performing a lot better than in AS2005. However I’m still finding scenarios where the calculation engine is the bottleneck and just one CPU is being used on the server when I’m running a query (something I discussed here). Surely it’s possible to get the calculation engine to do some calculations in parallel when a query is being run?
  • Be able to rename attributes in role-playing dimensions
    If there was one feature I wanted to see in AS2008 it was this. It would have been so easy to do, I know loads of other people wanted it, and I did log it in Connect ages ago: And it wasn’t included. All I want to do is to be able to rename the individual attributes of a role-playing dimension instead of just being able to change the dimension name – for example if I have Order Date and Ship Date role-playing dimensions based on a single Date dimension, then I want my ‘Calendar Date’ hierarchy in Date to appear as ‘Calendar Order Date’ and ‘Calendar Ship Date’ to my end users. Not being able to do this has stopped me using role-playing dimensions on so many occasions simply because seeing multiple hierarchies with the same name is too confusing.
  • Support for cross-dimensional user hierarchies
    Sometimes I find myself in the situation where I’ve got a large dimension, such as a Customer dimension, where the lower-level attributes are queried much less often than the higher level attributes. Now AS2005 handles large dimensions pretty well, in my experience, but it would still be useful to be able to store less useful lower-level attributes in ROLAP mode and more useful higher-level attributes in MOLAP mode. One solution to this could be the ability to set the storage mode on individual attributes rather than the whole dimension but I’m not sure that’s practical; instead it would be great if you could split the attributes up into a ROLAP dimension and a MOLAP materialised reference dimension and be able to build a user hierarchy that spans both. So you’d be able to drill down from Country to City to Post Code to Customer, and when you were querying the first three levels you’d be hitting a MOLAP dimension and when you queried Customer you’d be hitting a different ROLAP dimension. In fact I’m sure the ability to set up cross-dimensional drill paths would be useful in other scenarios too and make cubes much easier to use.
  • Support for partitioned dimensions
    Dimensions are getting bigger and bigger, and I wonder if at some stage it would be useful to be able to partition dimensions as well as measure groups to get better performance and manageability? Just a thought.
  • Be able to manage unary operators and custom member formulas from the MDX Script
    Perhaps this is just me, but sometimes I find it’s a bit of a pain when you’re using unary operators and custom member formulas that their values have to be stored in the dimension table. Can’t we have the option to be able to set or override a unary operator or custom member formula from the MDX Script – after all, that’s meant to the one central repository for all my calculations. I wonder if this is possible anyway with an ALTER CUBE statement? Hmm, I don’t think so.
  • New features for calculated members
    As I’ve mentioned here before, quite often I see people creating real members in dimensions then overwriting their values with scoped assignments when I’m sure they would be better off with using calculated members. So to meet that and other challenges, here’s a list of things it would be cool to allow with calculated members:
    • Allow calculated members to have other calculated members as children
    • Be able to control where calculated members appear in a hierarchy
    • Allow calculated members to have member properties, either returning static values or the results of MDX expressions
    • Following on from that, if calculated members can have calculated member property values, real members should have that too. For example, on a Customer dimension I might know a date of birth but I might also want to be able to calculate a customer’s age and show that as a property
    • Following on from that, it would be great to be able to create entire hierarchies based on calculated members or properties. So if I can calculate a customer’s age, and then have another calculated property that placed that customer into a bucket based on their age (eg 0-9, 10-19, 20-29 and so on), I’d also like to be able to determine the number of buckets based on some MDX as well (maybe using server parameters – see below) and create a user hierarchy where the top level would consist of calculated members representing these buckets and where I could drill down from these buckets to the individual ages and then down to the real customer members themselves.
  • Server parameters for MDX calculations
    The ability to parameterise an MDX query is cool, but it would also be great if you could also parameterise calculations on the server. For example you could set up a kind of server-wide variable which was the rate of Income Tax, do calculations using that value and then allow users to have the option of changing its value and explore ‘what-if’ scenarios. This would clearly have an impact on AS’s ability to cache calculations but so long as people knew about this then I’m sure it would be ok. These parameters would also be visible through the metadata so client tools would know to show users they could change them.
  • Be able to define either whole queries or ‘selectable chunks’ of MDX on the server
    After my posting on how dynamic sets actually work in AS2005 a few months back, I got to thinking about why I was disappointed and what I really wanted to see. The problem is that however well you model your cube, and however easy-to-use your client tool is, there’s always a gap between what the tools are capable of and what the user is able to accomplish with them. For example, you might know that your end users want to be able to create reports showing the top 10 customers with an extra value underneath that represents the aggregated values of all other customers outside the top ten. This is certainly possible in MDX and perhaps there are client tools out there that support this, but wouldn’t it be cool if you could create this as a kind of pre-canned selection that was parameterisable (eg show the top 20 instead of the top 10, or show products not customers) and visible through the metadata so any client tool could display it to the end user and allow it to be used? This would be a selection that would be designed to work wherever the user placed it in his or her query; it’s not the same thing as a dynamic set, more of a parameterisable custom set function. The concept could be extended from pre-canned selections to entire queries too – and if I haven’t convinced you (or been entirely successful in explaining) the concept of pre-canned selections then I think the case for entire queries stored on the server is much easier to make. Again these would be parameterisable and discoverable through metadata, so that any client tool would (hopefully) be able to use them; for Reporting Services alone this would be extremely useful, so you could have one query displayed in many reports with a single point of maintenance. I’m sure there would also be an opportunity to introduce some performance benefits for these stored queries too, perhaps in terms of being able to cache entire cellsets rather than just the values of individual cells.
  • Better management tools for security
    As I mentioned before here, the tools we’ve got at the moment are not up to the task of managing large numbers of security roles and some serious work needs to be done in the UI. It would also be great if the tools supported dynamic security and the creation of all the supporting objects, perhaps through a wizard.
  • Better support for Analysis Services in SQL Management Studio
    This is an easy one to specify: I want all of the functionality in MDX Studio put into SQL Management Studio. And a hundred other little things, like in the metadata pane I want to be able to open a level and expand a member to drill down as well instead of always having to start at the All Member of a hierarchy; oh, and I’d like to have a Reconnect button for when I’ve got a query and I’ve killed my session because of cube changes, instead of having to disconnect and reconnect. And the ability to display…
  • Query execution plans
    A lot of people have been asking for this already; here’s the Connect: This would be soooo useful in performance tuning.
  • Support for other RDBMSs
    Analysis Services is useful not only as a database engine in its own right, but as a query/calculation layer on top of other database engines. OK no-one uses ROLAP on SQL Server because performance is rubbish compared to MOLAP, but for years people have been experimenting with using AS in ROLAP mode on top of Teradata (see here) and I’m sure that it could be used in a similar way on top of other data warehouse appliances or the new generation of COP databases like Vertica. You’d be selling it as a way of OLAP-ifying these databases, getting the incredible scalability and performance they (allegedly) offer in combination with the power of MDX for calculations and access to the data via Excel. Writing new cartridges for these databases should be pretty easy; I suppose the problem comes with trying to generate SQL optimised for each of these platforms.
  • Writeback with no allocation
    MOLAP writeback in AS2008 certainly does improve performance, but the need always to allocate your values down to the granularity of the fact table is always going to lead to a performance hit. Wouldn’t it be useful if you could write a value back to a cell without having to allocate it down always? You’d get instant writeback. I’m sure that would be useful in many cases, and in fact it would work in the same way as if you’d assigned a value to that cell in the MDX Script. I wrote some code in the Analysis Services Stored Procedure Project that did this, in fact, although it wasn’t really anything more than a proof-of-concept exercise.

Anyway, this post has gone on quite long enough now, although I’m sure if I thought about it I could go on for even longer. I need to do some work! But what would you like to see in Analysis Services 2011? Answers in a comment, please…

Currency formats: should they be tied to language?

One of the most commonly asked questions on the AS MSDN Forum is how to format measures that contain values in different currencies with the correct currency symbol. I’ve never blogged about this because a lot of people have already written up the solution in detail, for example Mosha:

and Vidas:

However I was thinking about this recently and in my opinion there’s a big problem with using the Language property to do this. And that is that when you set the Language of a cell, you not only change the currency symbol but you also change other ways that the number is formatted, for example the symbols used as thousands separators and decimal separators. In the US and UK of course, we use full stops (I think they’re called ‘periods’ in the US?) as decimal separators and commas as thousands separators, but in continental Europe the roles are reversed. So the value:
would be interpreted as one hundred thousand and one in the UK, but one hundred point zero zero one in Germany, say, and the value:
would be interpreted in the opposite way. Borrowing one of the screenshots from Vidas’s post you can see how the Language property respects these conventions:


So you can see what the potential problem is – what happens if you have values in Euros, USDs and GBPs in your cube? However much you educate your users you can guarantee that someone at some time is going to get confused or worse not realise what’s going on and interpret the values incorrectly.

What’s the alternative then? I think using Format_String has to be the way to go. If you alter Vidas’s example so that instead of locale ids you put currency symbols inside the currency dimension named query, for example:

SELECT     CurrencyKey, CurrencyAlternateKey, CurrencyName,
                      CASE CurrencyAlternateKey WHEN ‘GBP’ THEN ‘£’ WHEN ‘EUR’ THEN ‘€’ WHEN ‘JPY’ THEN ‘¥’ WHEN ‘USD ‘ THEN ‘$ ‘ END AS LocaleID
FROM         DimCurrency
WHERE     (CurrencyKey IN
                          (SELECT DISTINCT CurrencyKey
                            FROM          FactCurrencyRate))

and then change his MDX assignment to be something like:

SCOPE ([Destination Currency].[Destination Currency].[Destination Currency].Members);
Format_String(This) = [Destination Currency].[Destination Currency].[Symbol].MemberValue + "#,#.00";

Then you get the desired result. However, one thing I did notice when I was experimenting with this is that if you try to use more than one character for your symbol (for example you might want to use CHF for Swiss Francs) you sometimes get the following error:

#Error The following system error occurred:  Out of present range. .

Not good. Here’s the bug logged on Connect:

Following on from all this, it also makes sense that users connecting from different locales automatically see numbers (but not currency symbols) formatted in the convention of their locale. So a German person might connect to the cube and see Euros with a € and USDs with a $, but see commas used as decimal separators, whereas a user in the UK would still see €s and $s with the correct symbol but full stops used as decimal separators. Now AS2K I seem to remember used to be able to handle this perfectly well – it could display the appropriate decimal separator and thousand separator depending on the client locale. However AS2005 RTM had a problem in that it worked ok for calculated measures but not for real measures; this was ‘fixed’ in SP2 so both calculated measures and real measures always got displayed in the locale of the server. American software, eh? And to think that so many members of the dev team are from Europe too. Here’s the Connect:

and here’s a forums thread on the subject:

Hmmmmm…. I need to check if this has been fixed properly in Katmai.

Interview with me on Cristian Lefter’s blog

While I was at PASS Europe last week Cristian Lefter videoed an interview with me on what I think is cool in AS2008, for the Romanian SQL Server user group. You can watch it on his blog here:
He also interviewed a load of other people, including Marco Russo.


Dave Wickert of Microsoft has asked me to relay the following message about a new version of ascmd he’s working on:

“In preparation of SQL Server 2008 Analysis Services RTM’ing, I am starting a small beta program for the next version of ‘ascmd’. This version will have two important new features:

1)      Built and verified against Analysis Services 2008 – but also directly compatible with AS2K5 – thus I need both AS2K8 and AS2K5 participates.

2)      New multi-user capabilities – the Microsoft SQLCAT team uses ascmd as the multi-user load facility for its Analysis Services benchmarks. Ever want to load up a few hundred users against a SSAS server? We are going to be retrofitting those capabilities (plus a few others) back into the released version of ascmd.


I am looking for users to test out this new version: 1) to, at least, slipstream this version into their current use to see if we’ve broken anything (we don’t think so, but we want to fix anything if we did), and 2) test out the new features if you can.


We are also actively soliciting new features and improvements for ascmd at this time – if we can sneak them in, we will.


The “readme” for the current version of ascmd is located here:


If you are an avid user of ascmd and would like to be included in this beta, please contact me directly (”

Distinct Count White Paper

Yet another excellent paper on optimising distinct count measures from the SQLCat team:

Actually I’m beginning to wonder whether I should be linking to the SQLCat team site – in the same way I never link to Mosha because I assume that everyone who reads my blog reads his too, then I would hope everyone subscribes to the SQLCat team blog as well.

One topic missing is a comparison of the performance of distinct count measures with the technique of using many-to-many dimensions to get the same result that Marco Russo describes in his famous m2m white paper:

Marco presented on this at PASS Europe and mentioned (which tallies with my experience) that this approach can perform as well as, and sometimes better than, a distinct count measure.

Panorama, Google and Analysis Services Support

New developments on the Panorama/Google front: users of Panorama’s pivot table for Google docs will be able to connect direct to Analysis Services via http (if they want to pay) and also upload local cube files and query them (which will be free). Here’s the announcement:

This comes hot on the heels of Google’s Salesforce link-up:

They’re slowly joining the dots…

I was asked not to talk about this until the press release went out, but then I found that Nigel Pendse has already beaten me to it and done an excellent job of summing everything up:

I agree with Nigel that I don’t think there will be many people at the moment who will be willing to expose their SSAS server via http. However there are a small number of companies who provide hosted SSAS solutions and this I think will become a much more popular option over the coming years. I do think, though, that the local cube option has some very interesting possibilities especially because it’s free. I can imagine it would be a great way of distributing data to disconnected employees like salespeople, or to partners, for small companies on a tight budget. Tim Peterson must be pleased – this should cause a resurgence of interest in local cubes.

PASS Europe and Webdashboard

I just got back from Germany after attending the PASS Europe conference. It was good fun as always and the standard of presentations was extremely high. Thanks are due to André Kamman and everyone else who organised it for making it such a success. Hopefully we’ll have another one next year? Bill Graziano has a picture (the one on the right) of me, Marco Russo, Allan Mitchell and various others here:

Both Marco Russo and I were impressed by Webdashboard, who were exhibiting:
You’ll probably follow the link and think it’s nothing special. In technical terms it isn’t, it’s a web-based dashboard that displays data from Analysis Services; it’s simple to use, quick to set up and runs pretty quickly. How do they expect to compete with PerformancePoint then, you ask? Well, have you seen just how much PerformancePoint costs? It may be cheap compared to other non-MS BI solutions, but in the MS BI world it’s very expensive. Webdashboard is just €95 per end user and for companies on a tight budget who don’t need a complex dashboard tool I think it makes for a compelling alternative.

There will be a SQL2005 SP3!

A triumph for people power! Microsoft have announced that there will be a SP3 for SQL2005 and it will be delivered this calendar year. More details here:
There’s also been yet another cumulative update released, CU7:
UPDATE: there’s a blog entry from the guys in PSS on how CU7 changes what happens when backups and processing overlap, here:

Announcing Intelligencia Query for Reporting Services

Can I be utterly shameless and use my blog to promote products I have a commercial interest in? Of course I can! But first, let me tell you a story…

Last December I spent a lot of time thinking about the problems of building Reporting Services reports on top of Analysis Services. To be honest, this issue has been a constant irritation for me ever since I first saw Reporting Services about five or six years ago; this blog entry, one of the most popular I’ve ever posted, sums up my thoughts on the matter:!1pi7ETChsJ1un_2s41jm9Iyg!163.entry
and I’ve blogged about specific workarounds and performance issues on a regular basis ever since. But anyway, as I was saying, last December I was working with a customer new to the MS BI stack who had announced their intention of using SSRS and SSAS together in the naive assumption that because these tools were produced by the same company they would work well together, and for the n-hundredth time I found myself having to manage their expectations and teach them the various tips and tricks necessary to get their reports working. I also wrote an article on this topic for SQL Server magazine that came out in December and I expanded the same material into a presentation for the last SQLBits (you can download the slide deck here); it was also around that time that I realised that nothing was going to change in Reporting Services 2008 (see this thread on the MSDN Forum for other people’s reactions to this). All this got me thinking.

The way you’d want SSRS to work is fairly clear: you’d want a decent query builder and the option of using any MDX query you’d care to write, and you’d want SSRS to render the resulting cellset in a matrix-like control automatically. It should work like pretty much like every SSAS client tool on the market only the output should be a SSRS report. The reason you can’t do this is also clear: SSRS needs to know the name and number of columns your query returns in advance, and you can’t do that with MDX queries. So how can you get around this limitation?

Then it hit me. Instead of having to unpivot your MDX query manually so that it’s in the format that SSRS likes, why couldn’t you have a custom data extension that did the hard work for you? So I wrote my first attempt in an afternoon and after a lot more polish it now works really well. Here’s an example of what it does:

Take the following query:

select {[Date].[Calendar Year].&[2003],[Date].[Calendar Year].&[2004]} on 0,
{[Product].[Category].&[4],[Product].[Category].&[1]} on 1
from [Adventure Works]
where([Measures].[Internet Sales Amount])

This returns the following SSRS-unfriendly results:


But when run through my custom data extension the data is returned as follows:

You’ll see that all of the columns have become rows, and there are two columns containing the Value and Formatted_Value properties of the cells. The cool thing is now that you can put a matrix control in your report, map the ‘column’ fields onto the matrix column groups, the ‘row’ fields onto matrix row groups and one of the cell properties onto the data area like this:

And when you run your report, the matrix control recreates the original structure of your MDX query:

This is pretty simplistic but even more complex queries with multiple levels of crossjoining on rows and columns are handled just as easily; it gets around all the limitations with the built-in Analysis Services data source/query builder as well as those of the OLEDB data source. Some of the key advantages are:

  • You can use any MDX query and display the results (see here for some of the problems associated with the built-in Analysis Services data source needing measures on columns)
  • You can parameterise your reports by measures (see here)
  • There’s no messing around with ‘aggregate rows’, all rows are treated the same
  • Because the field names (optionally) do not refer to the hierarchy names used in the query, you can write one report that displays the results of very different queries – the only requirement is that you have the same number of rows and columns in your query. For example you can create one report that displays either Years, Quarters or Months on rows and control which one gets displayed using a parameter. It also means that the report design is much more tolerant of changes to the query during development.
  • There’s full support for parameterisation, something you don’t have with the OLEDB data source.
  • It (optionally) indents member captions based on their level depths, so the captions of members on lower levels appear further to the right of captions of members on higher levels.

I was feeling quite pleased with myself at this point and was about to publish the code up to Codeplex when it occurred to me that I could make some money out of this – after all, I do have a wife, two kids and a mortgage to feed. There was also one big missing feature that I knew I didn’t have the time to implement on my own and that was an MDX query builder: whereas I’d be quite happy to write my own MDX queries, I know most people aren’t that comfortable doing so and need some kind of tool to do it for them. So I got in touch with Andrew Wiles of iT Workplace who I knew had an excellent MDX query control, signed a partnership agreement with him and over the last few months we (well, mostly him really) have worked to integrate his query builder with my custom data extension with BIDS to come up with what we think is a product that anyone using Reporting Services and Analysis Services will want to use, Intelligencia Query for Reporting Services. Yes, it’s something you’re going to have to pay for but in my opinion it’s very modestly priced and more importantly it has been saved my work from the fate of being a cool idea implemented by an occasional C# coder and stuck in perpetual beta on Codeplex, and turned into a commercial-quality product that is safe to install on your servers and which will grow and improve over time.

The main selling point of the whole solution is that it gives you, for the first time with Analysis Services, the ability to build in a query builder exactly the resultset you want to see displayed in your Reporting Services report – it’s truly WYSIWYG. Certainly in a lot of cases you can achieve what you want with the native query builder but as I said, with it you always have to try to imagine the end result you want and write a query that can be pivoted into that end result, and that’s not easy for the majority of people. Combine this with the fact that the Intelligencia query builder allows you to build much more sophisticated queries than the native query builder and that, as mentioned above, the custom data extension solves many of the problems you run into with the built-in Analysis Services and OLEDB data sources, and I think you’ve got something that finally makes it easy to build Reporting Services reports using Analysis Services data.

I think Andrew’s query builder is the equal of anything else on the market in terms of the kind of query it can create. It does all the stuff you’d expect like hierarchical selections, it does stuff that few other tools do such as allow for the creation of asymmetrical sets on an axis, and it has at least one outstanding feature that no-one else has in its ability to let users create MDX calculations with an Excel formula-like interface. It even supports SAP BW as a data source too. Here’s a screenshot:


Anyway, enough of me going on. You can find out more, download an eval version, watch a video and buy some licences here:

One last thing: we want to hear your feedback. This is a version 1.0 and while we’ve got a lot of ideas on new features that could be added, we want to be sure they are the features that our users actually want. So even if you only download the eval please drop me a mail to let me know what you like and what you think is missing. It’s likely we’ll be able to add new features pretty quickly.

illuminate, FAST and yet more completely unfounded (and probably ill-informed) speculation on my part

I was just reading this article on TWDI:

It’s on illuminate and their ‘correlation DBMS’ – not relational, not OLAP, not COP, but a ‘value-based’ system. All very fascinating indeed, although their web site is yet another one of those irritating ones that leaves you with the feeling that some important technical detail and information on where they’re positioning themselves is missing. This entry on their blog:
is a case in point – what is it exactly that illuminate can do that these other platforms can’t? I guess it’s the kind of query that they talk about here:
…which is certainly not the kind of thing that OLAP is good at, or even meant to be good at.

I might have completely the wrong end of the stick, but didn’t Microsoft get its hands on something similar when it acquired FAST last year (blogged here)? Is this it:
? I’m sure I remember reading about how FAST could speed up DW-style queries by loading all the data into an index; I’d be interested to hear from anyone who can set me straight on this. If I’m right, then perhaps some of the comments I made when talking about COP databases last year about how they would fit into the Microsoft BI stack would also be relevant here.

%d bloggers like this: