Generating large numbers of partitions using Excel

Quite often when I’m doing proof-of-concept type work I find myself in the situation where I need to build a cube with near-production data volumes, and in order to make sure that performance is good I have to partition that cube. However, setting up tens or possibly even hundreds of partitions manually in BIDS is no-one’s idea of fun, so how can you automate this process easily? If you’re using a SQL Server datasource then you should try using the functionality built into the Analysis Services Stored Procedure Project:

But what if you’re using an Oracle datasource or hit a bug with the ASSP code? Most consultants have a preferred method (such as their own custom code, or SSIS packages) but I thought I’d blog about the most commonly used approach which is to use Excel. Here are the steps:

  1. Your starting point should be a cube with just one partition in each measure group. I would recommend not putting all your data in this partition, but rather creating it as a slice of the data: the first of the partitions you want to create. So you’ll probably want to make it query-bound rather than table-bound, and set the Slice property appropriately (see here for why this is important).
  2. Open up SQL Management Studio and expand the tree in the Object Explorer so you can see the partition for your first measure group then right click and Script Partition As -> CREATE To. This will open a new XMLA window and give you your template XMLA command to create a partition.
  3. Let’s say we are going to partition by month, we have three months and they have surrogate keys 1 to 3. Our template partition is correctly configured for Month 1 and we want to be able to alter this template for months 2 and 3, so we’re going to ‘parameterise’ the following properties:
    1. ID – so, if your existing ID property is set to "Month_1", we need to replace the 1 with a string we can easily find with a search and replace like "@@@", making the new ID "Month_@@@"
    2. Name – which is usually the same as the ID
    3. Query Definition – you will have a Where clause in the SQL query behind the partition which is something like "Where Month_ID=1" and this should be changed to "Where Month_ID=@@@"
    4. Slice – the tuple will be something like "[Period].[Month].&[1]" which again should be changed to "[Period].[Month].&[@@@]"
  4. Copy this XMLA command text into a cell in a new Excel workbook, say cell A1. Make sure you paste the text into the formula bar and not directly onto the worksheet – you want it all in one cell.
  5. Underneath this cell we’re going to use Excel formulas to take this template and generate the XMLA needed for all the partitions we want. In cell A2 enter the value 1, in A3 enter 2 and so on for as many months as you need. Remember that if you enter values that increment by 1 like this, then select that area and drag it downwards, Excel will automatically fill the new cells with incrementing values.
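  6. In cell B2 we’re going to use an Excel formula to replace the string @@@ with the value in A2. So something like =SUBSTITUTE($A$1,"@@@",A2) will work; the absolute reference $A$1 keeps the formula pointing at the template when you drag it down.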
  7. You can then copy and drag this formula downwards, and you’ll see all your new XMLA commands to create partitions appear in B3 and the cells underneath
  8. Copy and paste the new XMLA into a new XMLA query window in SQL Management Studio
  9. You may find that some unwanted double-quotes have appeared now. You need to replace the double sets of double quotes ("") with (") and then delete the single sets of double quotes. So first do a find and replace on "" and change it to something like @@@, then do a find and replace to delete all instances of ", then do another find and replace to change @@@ to ".
  10. You now need to wrap these XMLA commands in a batch statement so they can be run together. So paste the following text before the first Create:
    <Batch xmlns="">
    and this at the very end:
    </Batch>
  11. Now delete your original partition and execute the new XMLA batch command you’ve just created and hopefully you’ll have your new partitions created. You can then process them.
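If you’d rather skip Excel (and the stray double-quote clean-up), the same search-and-replace idea can be sketched in a few lines of Python. This is only a minimal sketch: the function name build_partition_batch is made up, the template shown is a cut-down stand-in for whatever Management Studio scripts for you, and the Batch namespace is passed in as a parameter, so check it against the XMLA your own server generates.

```python
# Minimal sketch of the Excel approach: substitute a placeholder in a
# scripted XMLA Create command once per partition key, then wrap the
# results in a single Batch. All names here are hypothetical.

PLACEHOLDER = "@@@"


def build_partition_batch(template, keys, batch_xmlns):
    """Return one XMLA Batch containing a copy of `template` per key."""
    if PLACEHOLDER not in template:
        raise ValueError("template does not contain the placeholder")
    # One Create command per key, e.g. Month_@@@ becomes Month_1, Month_2...
    commands = [template.replace(PLACEHOLDER, str(key)) for key in keys]
    body = "\n".join(commands)
    return '<Batch xmlns="{}">\n{}\n</Batch>'.format(batch_xmlns, body)


# Cut-down stand-in for the scripted command, NOT a complete Create
# statement; paste your own parameterised template here instead.
template = (
    "<Create>"
    "<ID>Month_@@@</ID>"
    "<Name>Month_@@@</Name>"
    "<QueryDefinition>SELECT * FROM Fact WHERE Month_ID=@@@</QueryDefinition>"
    "<Slice>[Period].[Month].&amp;[@@@]</Slice>"
    "</Create>"
)

# Assumption: this is the namespace Management Studio normally puts on
# scripted XMLA; confirm it against your own scripted output.
xmlns = "http://schemas.microsoft.com/analysisservices/2003/engine"

batch = build_partition_batch(template, range(1, 4), xmlns)
print(batch)
```

Paste the printed output into a new XMLA query window, as in step 8. Because the text never passes through a worksheet cell, the quote clean-up in step 9 shouldn’t be needed.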

Post-Holiday Roundup

Ahh, so I’m back from my holiday and feeling much better, even if it did manage to rain every single day while I was away (that’s the risk you take with holidays in England). Now all I have to do is get through the massive pile of emails waiting for me and steel myself in preparation for the next few months of hard work… roll on Xmas! Anyway, a few interesting things that happened/thoughts that occurred to me while I was away…

Of course the big thing that happened, the day after I left, was the RTM of SQL2008. Hopefully you’ve heard this news by now, but the big questions here are: is AS2008 any good? Do I want to migrate, and if so, when? Personally, I’ve been using it for a few months now on a project and my impressions of it are positive. As I’ve said before there aren’t any really amazing wow features that will make you want to upgrade, but the performance improvements can in some cases be quite significant, the new BIDS is easier to use, and there are a few obscure fixes/changes in behaviour which tie up some loose ends left over from AS2005. Since migration is very, very easy indeed I would encourage you to install it on a test machine if you haven’t already and start thinking about moving up. Of course the mantra of ‘wait until SP1’ is so deeply ingrained in people’s minds that most people will want to do exactly that – and there’s a lot of sense to that approach, since the first bugs are being found already (see here) but equally there are a fair few known problems with AS2005 SP2 and given the problems that all of the CU releases have (see here for example, and I’ve heard the same story for every single CU, they create as many new bugs as they fix) I wouldn’t recommend them; I suppose you could wait for 2005 SP3 but my feeling is that AS2008 is the better bet.

Meanwhile, in the cloud I see that Good Data have gone into beta, and there’s a new, mysterious MDX-queryable (Mondrian-based?) offering that has broken cover called BI Cloud. If I have time, I’ll try to check them out. Also on the net seems to have a lot of good videos explaining the basics of AS. And there’s a new podcast featuring Richard Tkachuk from the SQLCat team where he talks about the performance improvements in AS2008 and seems to suggest that it’s now possible to use hints in MDX with a new function whose name I couldn’t make out – I’ll post if I get more details.

I’ve also been thinking some more about the DATAllegro deal. There seems to be some discussion about when something that works with SQL2008 can be released, and the folks at DATAllegro are keen to stress that their architecture allows them to plug in new RDBMSs easily, so the implication is that it will be sooner rather than later; clearly the investigation work has been going on for a while, and must somehow tie in with the MatrixDB stuff that got leaked a few months ago. All of this would be good for AS running in ROLAP/HOLAP mode on an MPP SQL Server, but I wonder whether this technology could be made to work with AS in MOLAP mode? I think it could – surely the hooks are already there with the remote partitions/linked measure groups/dimensions stuff. Just conjecture though; I think we’ll find out more around the time of PASS and the BI Conference.

Lastly, I’ve booked my place for Mosha’s MDX Deep Dive pre-conf seminar at PASS this year. Who else is going?

SQLBits and my training day

And before I disappear off on my hols for a few weeks, can I remind you that SQLBits is happening on September 13th and that you really ought to be there? We’re just about at the stage of finalising the sessions (and we’ve got a great BI track lined up) so check for more details!
As I mentioned before, Allan Mitchell and I will be doing a 1-day pre-conf seminar the day before (September 12th) on the Microsoft BI stack:
It’s an introductory session, so if you’ve got colleagues who want to get a good overview of what BI is and what you can do with the MS tools in this area, then send them along.

Bill Baker leaving MS

Bill Baker, pretty much the top guy in BI at Microsoft since Microsoft first got interested in BI, is leaving the company. Not that I’m reading anything much into the move though – after ten years he’s probably looking for a new challenge or some way of spending all that money he’s earned.