Handling Relative Time Periods

I recently had an interesting exchange with Michael Barrett Jensen and Thomas Pagel on the MSDN Forum about the best way to handle the problem of relative time periods, i.e. letting the user select a member called something like ‘current month’ which always returns the most recent month’s data (see http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=346119&SiteID=1 and please ignore the posts in the middle of the thread where I talk rubbish about when sets are evaluated). Now for some scenarios it’s a good idea to use named sets to hold the ‘current month’ or the last five months or whatever you want – if you have a client tool which handles sets well, the advantage is that it works as a kind of dynamic selection and you get the set of members you want each time. However not all front ends (e.g. Excel) support sets, they can be a difficult concept for some users to understand, and they are a bit inflexible. So as an alternative here’s the outline of a method for building a relative time dimension to add to your cube – it’s more of a sketch than a robust solution, so I’d be interested in hearing feedback from anyone who tries to implement it.
 
Anyway, the first thing I did was create a new table in Adventure Works DW from which I built my Relative Time dimension. For the purposes of this exercise, I only created two columns/attributes: Relative Year and Relative Quarter. Relative Year had two members, Current Year and Current Year-1, and Relative Quarter had eight members, Current Quarter to Current Quarter-7. I then added this to the Adventure Works cube, but did not make any changes on the Dimension Usage tab, so it had no relationship with any measure group. I then added the following code onto the end of the MDX Script:

//Relative Time
([Relative Time].[Relative Quarter].[Relative Quarter].Members, [Measures].[Internet Sales Amount]) =
    ([Measures].[Internet Sales Amount],
     Tail(
        [Date].[Calendar].[Calendar Quarter].Members,
        Rank(
            [Relative Time].[Relative Quarter].CurrentMember,
            [Relative Time].[Relative Quarter].[Relative Quarter].Members)
     ).Item(0)
    );

([Relative Time].[Relative Year].[Relative Year].Members, [Measures].[Internet Sales Amount]) =
    ([Measures].[Internet Sales Amount],
     Tail(
        [Date].[Calendar].[Calendar Year].Members,
        Rank(
            [Relative Time].[Relative Year].CurrentMember,
            [Relative Time].[Relative Year].[Relative Year].Members)
     ).Item(0)
    );

What this does is overwrite the values displayed in the cube for the measure [Internet Sales Amount] everywhere below the All Members of the two attributes on Relative Time – Current Year shows the value of the last member on the [Date].[Calendar].[Calendar Year] level, Current Year-1 the value of the member before that, and so on. You now have a Relative Time dimension in your cube that users can use in their reports, but which has no impact on aggregations or processing. You can change the scope of the assignments to cover all the measures in your cube fairly easily.
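For instance, one way of covering every measure rather than just [Internet Sales Amount] would be to use a Scope statement that doesn’t restrict the Measures dimension – the following is just an untested sketch of the Relative Quarter assignment rewritten in that style:

```
//Sketch: apply the Relative Quarter calculation to all measures
Scope([Relative Time].[Relative Quarter].[Relative Quarter].Members);
    This = (Tail(
                [Date].[Calendar].[Calendar Quarter].Members,
                Rank(
                    [Relative Time].[Relative Quarter].CurrentMember,
                    [Relative Time].[Relative Quarter].[Relative Quarter].Members)
            ).Item(0));
End Scope;
```

Because no measure appears in the Scope or in the tuple on the right-hand side, the assignment applies to whatever the current measure happens to be.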
 
The advantages of physically building the dimension, rather than using calculated members to do the same thing, are that a) you get drill down from Relative Year to Relative Quarter and b) your MDX Script doesn’t get cluttered with (potentially) hundreds of calculated members.
 
As I said, this is really just a proof of concept. I’m not 100% satisfied with the way this works (for example I’d like to see it return something nicer than repeated values when you crossjoin Relative Year and Calendar Year, and the Root on Relative Time doesn’t show sensible values unless you have as many relative periods as actual periods), but I thought it was worth posting in case anyone was interested.

PROOF

Seen on Claudia Imhoff’s blog today, PROOF seems to be an interesting new tool from Symmetry Corporation:
it sounds like it does roughly what the old SSABI (SQL Server Accelerator for BI) did. I’ll have to see if I can get an eval copy…

Solve Order in AS2005

You may already be aware – possibly the hard way – that solve order has changed in AS2005. Notably, calculated members defined with query or session scope can’t have an effective solve order lower than any calculated member defined on the server, the only exceptions being calculated members which use the AGGREGATE() function and (I believe) ones which use VisualTotals functionality.
 
Here’s an example of this happening, using the [Gross Profit Margin] calculated measure in AdventureWorks:

with member [Product].[Category].AggTest as
    Aggregate([Product].[Category].[Category].Members)
member [Product].[Category].SumTest as
    Sum([Product].[Category].[Category].Members)
select
    [Measures].[Gross Profit Margin] on 0,
    [Product].[Category].AllMembers on 1
from [Adventure Works]
In this case you can see AggTest is evaluated at a lower solve order than [Gross Profit Margin], whereas SumTest is evaluated at a higher solve order. MDX Solutions has a great write-up of this behaviour (courtesy of George), so I suggest you buy a copy if you’d like to know more about the details. It gets even more fun when you want to create calculated members with query or session scope which have solve orders higher than some server calcs and lower than others – one way of doing this is to use SUM() for the former and AGGREGATE() for the latter depending on what the current member on Measures is, eg IIF(Measures.CurrentMember is Measures.ABC, SUM([MySet]), AGGREGATE([MySet])).
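To make that last trick concrete, here’s a sketch (untested – the choice of [Gross Profit Margin] in the IIF is just an arbitrary example) of a single calculated member that behaves like SumTest for one measure and like AggTest for everything else:

```
with member [Product].[Category].MixedTest as
    iif([Measures].CurrentMember is [Measures].[Gross Profit Margin],
        //SUM: evaluated at a higher solve order than the server calc
        Sum([Product].[Category].[Category].Members),
        //AGGREGATE: evaluated at a lower solve order
        Aggregate([Product].[Category].[Category].Members))
select
    [Measures].[Gross Profit Margin] on 0,
    [Product].[Category].AllMembers on 1
from [Adventure Works]
```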
 
What isn’t in the book, though, is what has been happening since RTM. One post on the MSDN forums that I’ve pointed to before refers to a property called CalculationPassMode which will be in SP1:
However, as Mosha comments in his reply, there are some unpleasant side effects. There is apparently another property coming called SCOPE_ISOLATION which won’t be in SP1 which will give us back the AS2000 behaviour of query and server solve orders working properly together. I’ll let you know if and when it does appear.
 
 

Local Cubes and .NET 2.0

Here’s a little nugget of information I thought I’d post here for future reference: if you want to create local cubes direct from relational sources in AS2005 you need to have .NET 2.0 installed on your machine, although you don’t need to if you’re creating your local cubes from server cubes. I’m sure someone will find this useful… thanks to Siva for letting me know about this.
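For reference, creating a local cube from a server cube can be done with a CREATE GLOBAL CUBE statement – the file path, cube name and the choice of measure and dimensions below are just made-up examples:

```
CREATE GLOBAL CUBE [MyLocalCube]
STORAGE 'C:\MyLocalCube.cub'
FROM [Adventure Works]
(
    MEASURE [Adventure Works].[Internet Sales Amount],
    DIMENSION [Adventure Works].[Date],
    DIMENSION [Adventure Works].[Product]
)
```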

I’m an MVP!!!

I found out earlier this week but I thought I’d mention it here. Obviously I am very chuffed indeed. Time for a tearful Oscars-style acceptance speech: I’d like to thank everyone out there who has helped me over the years, whether at work, on the newsgroups or on this blog… (sniff)… I love you all!

Microsoft buys Proclarity

This is all over the blogs today, and with good reason: it’s the biggest news in the Microsoft BI world since, well, the release of OLAP Services. Here’s the press release:
 
Now, why has this happened, what does it mean and is it a good thing? Well, the first question is fairly easy to answer. Microsoft has for a long time suffered from not having a complete set of tools when it comes to BI: customers had to buy their server from one place (Microsoft, a big, fairly well trusted company) but unless they were willing to use Excel pivot tables (and few were) they were then told to go and look at all the other client tools on the market, which meant another round of evaluations, another swarm of salesmen to talk to and significant extra expense. It didn’t help that all these other client tool companies were relatively small and unknown, and their products were, let’s face it, sometimes lacking in quality and finesse. Certainly some companies appreciated having a choice, but in my experience the majority, especially the enterprise customers, didn’t. So as a result companies like Cognos and Microstrategy benefitted from being seen as the ‘one stop shop’ for BI solutions.
 
This situation was the result of a professed strategy by Microsoft to deliberately stay out of the client tools market. The thinking was that Excel and the other MS client tools would take the low-hanging fruit and leave the rest of the business to third parties like Proclarity; competition between these third parties would stimulate innovation and provide the customer with choice. In my opinion, though, this innovation didn’t really appear and the only choice the customer got was between similar tools; that, together with the point I made in the previous paragraph about customers having to buy from two vendors, is I think why this strategy got dumped. Interestingly, I heard from a Proclarity guy ages ago that MS tried to buy them in 2001 but Proclarity turned them down (MS going on to buy what became Data Analyzer instead); I would imagine that Proclarity changed their mind about being bought when they realised that the new functionality in Office 12 would mean that Microsoft would not only be taking the low-hanging fruit but just about everything edible on the lower half of the tree, providing much stiffer competition.
 
Is this a good thing then? For Microsoft and for its customers, yes. Proclarity is a good tool – I was impressed with what I saw of its latest release when I saw it a few months ago, and it seemed to be the tool on the market that took greatest advantage of the new features of AS2005. It also provides much needed infrastructure which now, presumably, will be much more closely integrated with the rest of Microsoft’s tool suite; I would guess that Microsoft will be investing even more in development to make the tool even better. It will be interesting to see what happens with pricing too – I would guess that licence costs will go down, with the obvious effect of making the Microsoft BI suite even more attractive on that front.
 
Of course there are going to be some losers in this situation, and in this case it’s all the other third party tool vendors. Panorama especially seem to be screwed – why buy their stuff when you can buy the equivalent tools directly from Microsoft? I can see that if the integration of Proclarity into Microsoft takes a long time, or if the integration of Proclarity tools with other Microsoft tools is too tight (e.g. if customers don’t use Sharepoint, or are unwilling to upgrade to Office 12, then they can’t use the other MS BI tools), there might be some short-term opportunity for them, but in the long run I can’t imagine they’ll prosper. Then there are all the much smaller tool vendors, who carved out niches based on price or specialist functionality or who were tied to consultancies: times will be much harder for them now they’re seen as being in direct competition with Microsoft and their cost advantage is eroded, and many of them will disappear too. Listen out for the sound of wailing and gnashing of teeth in the partner community over the next few months… the comments on Mosha’s blog entry about this are only a start:
It seems a shame that this is happening but at the end of the day Microsoft’s first and only priority is its bottom line and its customers, and as I’ve said from that point of view this is the right move.

Book Review: Data Mining with SQL Server 2005, by Jamie MacLennan and ZhaoHui Tang

Data mining has been one of the most widely touted of the new BI features of SQL 2005. Anyone who’s been in this industry for any length of time (and I haven’t been for all that long, really, only about 8 years now) will know that it’s been ‘the next big thing’ for ages but has never seemed to break through into the mainstream – whether Microsoft will succeed now will be in part down to whether they can make it easy enough for developers to understand what is a quite difficult technology, and having a good book available to educate us developers is an important step on the road to doing that.
 
So is ‘Data Mining with SQL Server 2005’ any good? Overall, yes: I’ve been reading through it very slowly since I got it late last year, and I certainly feel like I’m much closer to being able to do a real data mining project than I was before. It provides very readable explanations of how each of the algorithms available in AS data mining work (with one exception – see below), managing to give just enough detail on the theory behind them without being too dry and academic. It also covers the wider issues such as the lifecycle of a data mining project and other data mining tools and standards on the market, and it’s especially strong on the programming side too with plenty of examples of DMX and how to integrate data mining into your own .NET apps.
 
There are some criticisms to make though. Text mining is not covered in anywhere near enough detail, getting only two or three pages where it probably deserved its own chapter; if you’re looking for an explanation of how Term Extraction actually works I recommend you go to chapter 6 of "Professional SQL Server Integration Services" instead. It also has a somewhat uncritical tone, which is probably to be expected given that it was written by two of the lead developers of the product: it doesn’t point out any bugs or quirks, and doesn’t pre-empt any of the mistakes that novice users are likely to make in the way that good tech books do.
 
To sum up, this book is definitely worth buying if you’re interested in SQL2005 data mining – but don’t expect to have mastered the subject just by reading it!
 
If you’re in the UK, you can buy ‘Data Mining with SQL Server 2005’ here.
 
 UPDATE: I’ve just had a mail from Sean McCown, pointing me to his site where he’s also reviewed this book:

AS2005 Dynamic Security Performance

There’s an interesting post on AS2005 dynamic security performance over on David Gustafson (of Proclarity)’s blog:
 
UPDATE: see also the follow-up post:

Redundant Attribute Relationship Health Warning!

My ex-colleague Jon might spend far too much time pseudonymously posting silly comments on blogs, but he deserves an extra helping of chocolate mousse for finding this problem to do with redundant attribute relationships – this is something everyone designing AS2005 cubes needs to be aware of!
 
Imagine you had a Geography dimension with three attributes: City (the key attribute), State and Country, where each Country has many States, and each State has many Cities. Now you probably already know that if you have built your dimension with the wizard, the chances are that your dimension will have two relationships pre-built – between City and State, and between City and Country – and that you need to change this so that there are relationships between City and State, and between State and Country, in order to get optimal performance and to help aggregation design. AS2005 understands transitive relationships between attributes, and so even though there are just two relationships built in your dimension it knows that because City is related to State and State is related to Country, there is a transitive relationship between City and Country.
 
BUT, did you know that bad things can happen if you actually build the relationship between City and Country in your dimension? Before Jon found out about this, I had understood that the only effect was that having this redundant relationship might increase the size of the dimension and increase processing times. However, it turns out there are two other potentially serious side-effects:
1) MDX Script Scopes may include unexpected extra cells. So, for example, if you’re assigning values to a subcube in your MDX Script (and most of the calculations created by wizards in BI Dev Studio do this) then you may see incorrect results with the redundant attribute relationships in place.
2) Querying fact data may ignore some slices or return Nulls on certain related attributes when the measure group has granularity on an attribute different from the key attribute.
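To make side-effect 1) concrete, a hypothetical assignment like the one below (the measure and the member key are made up purely for illustration) could end up writing to more cells than the subcube you specified if the redundant City-Country relationship is left in place:

```
//Sketch: a scoped assignment that may hit unexpected extra cells
//if a redundant City-Country attribute relationship exists
Scope([Geography].[Country].&[United Kingdom], [Measures].[Sales Amount]);
    This = [Measures].[Sales Amount] * 1.1;
End Scope;
```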
 
Apparently what’s happening is that, using our example, the redundant relationship between City and Country causes AS to ignore the legitimate relationship between State and Country. Even more confusingly, if you were to create a user hierarchy from Country to State to City then the incorrect behaviour would go away.
 
So the moral of all this is to be extra-careful when designing your dimensions and not to let any redundant attribute relationships remain in there. This problem won’t be fixed in SP1, but there will be a warning in the dimension editor UI when redundant relationships are detected, which should make more people aware of it.
 
Apart from Jon, thanks also to Matt Carroll and Marius Dumitru on the AS dev team for taking the time to explain what’s going on here to me.
 
 

SQL2005 SP1 CTP Released

The CTP of SP1 for SQL2005 is now out:
There’s also an updated version of BOL. No idea exactly what fixes have been included, though, because the link to the KB article in the readme file is dead! However, for Analysis Services, I’ve heard about one potentially interesting change with solve order – see the following thread on MSDN forums for the full story:
I wonder if the ability to cache the results of MDX Script calculations is also going to be there?
 
UPDATE: The KB article is now up and there are lots of AS fixes in there. Here’s the link: