Since the first SQLBits was such a success, we’re doing it again – this time in Birmingham (that’s the UK one, obviously, not the one in America) on the 1st of March 2008 at the Lakeside Conference Centre. Put the date in your diary! For more information see: http://www.sqlbits.com/

If you came to the last one then you can expect the same thing again only bigger and better, and of course it remains free to attend. If you didn’t come to the last one and you’re wondering whether to believe the hype, then here are some comments plucked from the blogosphere that will hopefully sway you:

Registrations aren’t open yet but at this point we’re looking for sponsors (http://www.sqlbits.com/information/Sponsorship.aspx) and speakers (http://www.sqlbits.com/information/SessionSubmission.aspx). If you’ve got a SQL Server-related product or service to promote (and that includes you, Microsoft BI vendors and consultancies) then there will be no better opportunity to meet potential customers. And if you’re dying to impart your hard-won technical knowledge to your peers and want to take your first steps on the conference circuit, again SQLBits is the place to be; in particular for the BI track I’d be happy to hear from anyone new willing to speak about SSAS, SSRS, SSIS, PerformancePoint or MDM.

Multiple Assignments Bug

I don’t usually blog about the bugs I find in AS because, well, bugs usually get fixed pretty quickly and are usually only seen in very specific scenarios. However I’m going to make an exception here because this bug is very easy to reproduce, it can lead to inconsistent values being returned and it doesn’t look like it’s going to be fixed any time soon.

Here’s the repro on Adventure Works (I used the Simple version, but I’ll assume it works on the regular version too):

  1. Comment out the MDX Script apart from the Calculate statement
  2. Add the two following assignments:
    ([Measures].[Reseller Sales Amount])=1;
    ([Measures].[Reseller Sales Amount], [Employee].[Employees].&[291])=2;
  3. Run the following query on a cold cache and note that the value returned for the employee Stephen Y. Jiang is 11 (all other non-leaf members show values aggregated up from their leaf members too):
    select [Measures].[Reseller Sales Amount] on 0,
    on 1
    from [Adventure Works]
  4. Now run the following query which only returns data for Stephen Y. Jiang without clearing the cache and note that the value returned is now 1:
    select [Measures].[Reseller Sales Amount] on 0,
    on 1
    from [Adventure Works]
  5. Now rerun the query from step 3 and note that the value for Stephen Y. Jiang is still 1.

I have a case with PSS open about this, and from what I can gather the problem is to do with making multiple assignments to the same cell; I’ve only reproed it with a parent/child hierarchy, but I’m told it could appear on a regular hierarchy as well. In my opinion it’s the values returned in Step 3 that are incorrect since an assignment on the member Jae B. Pak should only trigger an aggregation from leaf level for that member’s siblings; the values from Step 4 for Stephen Y. Jiang and the other non-leaf members are right.

The job I’m working on at the moment is a financial application, and the only workarounds seem to be liberal use of FREEZE or making sure that you don’t try to assign values to the same cell more than once, neither of which are exactly feasible in my situation: I’m already dealing with a vast amount of code in the MDX Script and getting either approach to work is next to impossible. I would assume that most financial applications will have parent/child hierarchies (eg on an Accounts dimension) and multiple overlapping assignments in them, so be warned! I wonder if this is going to be a problem for PerformancePoint applications?

AS2008 Block Computation

A lot has been said about block computation as it’s the biggest feature coming in AS2008. Basically it means better performing MDX, but as you’d expect it doesn’t mean that all your MDX queries and calculations are going to run faster, just some. But how do you know whether you’re going to benefit? Well, digging through the latest update of AS2008 BOL today I found the following, very detailed description of when you will benefit from these performance improvements:
Very interesting… the list of set functions is a bit limited (where is Filter?) but in general it looks like all common calculations are going to get faster.

SQL2008 November CTP released

I’m a bit late on this (ie about a day, but that’s late in the blog world) but the November CTP of SQL2008 has been released and you can download it here:
I won’t go into the details of what it contains because various other people have already done so; the best round up I’ve seen is from Vidas Matelis:
Apparently the only feature missing from this CTP for AS is the ability to share a single AS database between multiple instances; there don’t seem to be any new features coming that I didn’t know about apart from some new Perfmon counters and some new Profiler trace events, and the rumoured query plans have not made the cut. Am I alone in feeling a little… underwhelmed by the new features in AS2008? Maybe that’s because I haven’t had a chance to test out the block computation changes yet (I might do this week though) – they’re not very obvious or demoable but they should hopefully make a big, big improvement to the performance of many calculations.

Enterprise Cube

Picked up from Dan Meyers and Peter Koller, it looks like Microsoft are working on some packaged BI solutions aimed at specific verticals which will be sold under the brand name ‘Enterprise Cube’. Apart from Mark Kromer’s blog, which Dan points to, the only other information I can find about it is this session abstract from the Australian BI summit, which describes the first of these solutions aimed at the telco industry:

Microsoft Enterprise Cube – BPM Solutions for Today’s Business Needs
Haresh Khatwani, Group Product Manager MEC, Microsoft & Daniel Yu, Product Manager, Industry Solutions Group, Microsoft

Microsoft Enterprise Cube is the only packaged end-to-end BI offering at Microsoft and provides the framework for pain-point-specific BI solutions for industry verticals. Using case studies form leading telecommunications companies in the USA and Korea this session will show you how the MEC framework enables modularity, quick installation, quick time-to-market and how your organisation can leverage the Microsoft BI product stack to report, analyze, trend and forecast business data without effecting live business support systems.

I’m always sceptical about the technical value of generalised industry solutions, except when the data involved varies so little in format that you can genuinely stick a pre-defined cube on top of anyone’s data with the minimum of customisation. For example, solutions on the Microsoft BI platform that I know about that can do this include Data Intelligence’s PharmaAnalytics solution for pharma market research data, and PrecisionPoint’s solution for Dynamics, and I’m sure there are hundreds of other similar ones out there. Perhaps they could be brought into the ‘Enterprise Cube’ fold too under some kind of partner programme?  

From a marketing point of view, though, I think it makes a lot of sense – it will bring on board a lot of customers and partners who would otherwise not look at the Microsoft BI platform, and as Peter comments this is something the other BI vendors have been doing for a long time and makes Microsoft look more credible in this space.

Viewing Partition Slice Information

The other week I discovered that it was possible to view the partition slice information discussed here using the DISCOVER_PARTITION_DIMENSION_STAT schema rowset. I’ve always thought it would be cool to be able to visualise this information so you could easily check to see whether your dataid ranges overlapped or not; to do this, though, you would need to see the slice information for all your partitions in one resultset and DISCOVER_PARTITION_DIMENSION_STAT only returns data for one partition. So this week I dusted off my coding skills and wrote a sproc that loops through every partition in a measure group, calls DISCOVER_PARTITION_DIMENSION_STAT and concatenates the results as well as checking to see which partitions have ranges which overlap each other. I’ve added it to the source code for the Analysis Services Stored Procedure Project, which you can get hold of here:


Note that the function isn’t part of the latest release, you will have to download the code and compile it yourself. As an example of what it does, in SQLMS if you connect to the Adventure Works database and then run the following statement:

call assp.DiscoverPartitionSlices("Adventure Works", "Internet Sales")

…you’ll get a resultset showing all the partition slice information for every attribute on every partition in the Internet Sales measure group.

Having done this I was able to put together a simple SSRS report to display this information. I say simple, it probably took me longer to create the report than to write the sproc because I couldn’t work out how to visualise the ranges in a useful way; in the end I used the RangeColumn chart style from Dundas, which seemed to be the best fit. Here’s an example of what the report returned for the Date attribute on the Date dimension and the Internet Sales measure group:


I cheated a bit by setting the IndexBuildTheshold very low so that indexes would be created for the 2001 and 2002 partitions, but it shows what a healthy set of slices looks like (the Internet Sales measure group is partitioned by Calendar Year). Another attribute worth looking at is the Product attribute on the Product dimension:


What we see here is that in 2001 and 2002 only a small number of products were actually sold compared to 2003 and 2004. In this situation it might be worth trying to reorder the members or set a slice on 2001 and 2002 so that AS knows about this distribution: it would mean that if you were running a query for the sales of a Product that was only sold in 2003/4 for all time periods, AS would not need to scan the 2001 and 2002 partitions because it would know in advance that there would be no data for that product in those partitions. One of the improvements I would like to make to the sproc is to display member names and unique names as well as dataids so that the slice ranges are easier to understand; Mosha also had the idea that it would be good to be able to take these unique names and use them to set a slice on a partition for exactly the scenario I’ve just described.

Lastly, I found what is almost certainly a bug in AS when I was looking at the Calendar Semester attribute on the Date dimension:


Calendar Semester has, as you would expect, a one-to-many relationship with the Year attribute and we would expect to see no overlaps as a result -but as the report shows, that isn’t the case. When you run a query on this attribute, for example:

select measures.[internet sales amount] on 0,
{[Date].[Calendar Semester].&[2004]&[1]} on 1
from [adventure works]

You see partitions being hit when they shouldn’t be hit – in this case the 2003 and 2004 partitions – and this even sometimes seems to happen when you have a slice explicitly set on the partition using the Year attribute, which is pretty bad. Greg Galloway did a bit of digging on this and discovered that the SQL generated returned rows in the same mixed-up order as the dataids of the members, and fixing the SQL to return members in the order you want the dataids stops this happening to a certain extent. Mosha says that it’s a problem with the decode tables created within the dimensions, but whatever the cause you may want to check your dimensions to see if it’s happening to you. This kind of problem has been noticed before and I knew about the Disable Prefetch Facts connection string property that was introduced to deal with it, so I wonder if this is related in some way? Anyway, I logged the bug on Connect and I’ll blog about any new information that I get on it:


Enterprise Search and BI

I notice from various sources (for example, Don Dodge) that Microsoft have released a free version of their Enterprise Search product, Microsoft Search Server 2008 Express. The thing that caught my eye was the list of federated connectors:


…which includes Business Objects, Cognos, SAS, but there’s no mention of Reporting Services or Analysis Services anywhere. As I think I’ve said here before, I’m not convinced that a search interface on top of a BI platform is going to be useful in the real world (though I bet you could do some cool demos with it): I suppose if you have hundreds of SSRS reports for example you might want to look for the ones that contain figures for a particular Product or Customer, but I would have thought that it’s just as likely that you’d do a search, find a report and then find you don’t have permission to view it. As for using a search interface as a way of querying a cube, all I have to say about that is two words: English Query.

But I think there’s a more interesting application for BI here: what if you could build a cube off the index this thing creates? You could have dimensions like Date Updated, Keyword, File Type and Path, and measures like Count of Files and File Size; you’d be able to do things like create reports which tracked the overall space taken by mp3 files on your network and where these files were, the number of emails with the phrase "new job" in; even just browsing ad hoc in Excel you’d have a new way of searching for files: for example, you could slice on File Type=Word doc, Keyword="CV" or "Resume" and then put the Path dimension on rows and drill down to find all the CVs on your network.

%d bloggers like this: