Solving the ‘Events in Progress’ problem in MDX, Part 2–Role Playing Measure Groups

In my last post I described a simple solution to the ‘events in progress’ problem in SSAS and MDX, which nonetheless had one drawback: it involved loading all the data from the same fact table into two different measure groups in the same cube, which of course potentially doubles the amount of time taken to process the cube and its size on disk. I used the same technique in my recent post on improving the performance of currency conversion, and indeed it’s a technique that I have used in the past with several customers successfully; but it seems rather inelegant, so is there a way we can avoid doing it and only have one measure group? It doesn’t seem to be possible at first glance and I actually opened a Connect a while ago asking for this functionality (‘role playing measure groups’) to be implemented. I was having a good week last week, though, and at the same time as solving the ‘events in progress’ problem I also came up with a solution to this problem too…

Here’s what I did, using the ‘events in progress’ problem as an illustration:

First of all, I created and processed a simple cube called ‘EventsInProgressSource’ with a single measure group, a single Count measure, and two dimensions, Order Date and Ship Date having regular relationships joining on the OrderDateKey and ShipDateKey columns on the fact table:

I then created a second cube, ‘EventsInProgress2’, using the ‘Create Empty Cube’ option in the Cube Wizard.

I then started the New Linked Object wizard and copied everything from the ‘EventsInProgressSource’ cube to the EventsInProgress2 cube:

This resulted in a cube with one linked measure group and two linked dimensions:

Now here’s the fun bit. I then deployed and processed the cube, closed BIDS and went to SQL Management Studio. There I scripted the Internet Sales Facts linked measure group in ‘EventsInProgress2’ to an XMLA Create statement, then manually updated the XMLA by adding a 1 to the end of the name and ID of the object itself and the Line Item Count measure, then executed the script against the ‘EventsInProgess2’ cube. This created a second, identical linked measure group – something that again BIDS doesn’t let you do. I then reopened BIDS and connected direct to the cube in online mode (I could also have reimported the project back into BIDS) and went to the Dimension Usage tab, then deleted the relationship between Ship Date and the first linked measure group and Order Date and the second linked measure group, leaving the relationships like this:

I then added another Date dimension and set up referenced relationships (which had to be non-materialised) with each measure group via the Date attributes of the Ship Date and Order Date dimensions:

With this done we have achieved out goal: we have the same fact table appearing twice in the same cube as two different measure groups with different dimensionality, but we are only processing the data once. The last step to solve the ‘events in progress’ problem is to add what is essentially the same MDX as last time to the cube:

CREATE MEMBER CURRENTCUBE.MEASURES.OrderedToDate AS
SUM(NULL: [Date].[Calendar].CURRENTMEMBER,
([Measures].[Line Item Count]));

CREATE MEMBER CURRENTCUBE.MEASURES.ShippedToPrevDate AS
SUM(NULL: [Date].[Calendar].CURRENTMEMBER.PREVMEMBER,
([Measures].[Line Item Count1]));

CREATE MEMBER CURRENTCUBE.MEASURES.OrderedNotShipped AS
MEASURES.OrderedToDate – MEASURES.ShippedToPrevDate;

And we get the results we want out of the cube:

Now I haven’t tested this with anything other than the Adventure Works data, and there are some areas where I’d like to do more testing:

Non-materialised reference relationships don’t always perform that well. Materialised reference relationships aren’t allowed with linked measure groups though, so possibly using a m2m relationship to join the Date and Order Date/Ship Date dimensions might be an alternative
There are various ways of calculating a total-to-date in MDX and I’ve not spent any time working out if the version I’ve shown above is the most efficient.
There is going to be an overhead from querying a linked measure group rather than a regular measure group, and also probably an overhead from trying to query the same source measure group twice rather than two different measure groups, but I don’t know how significant it is.

If anyone out there does decide to try this at scale with their own data, please leave a comment and tell me about your experiences.

Solving the ‘Events in Progress’ problem in MDX, Part 1

I wouldn’t admit to this in public (at least to an audience of normal people, rather than the Microsoft BI geeks reading here), but over the last few years I’ve been obsessed with solving the ‘events in progress’ problem in SSAS and MDX. I’ve tackled it successfully in PowerPivot and DAX (see here and here) but I always thought there was no good solution in SSAS, and certainly other people such as Richard Tkachuk who have written about the same issue have come to a similar conclusion. But… last week I found one! So in this blog post I’ll outline the basic approach, and in my next post I’ll detail an even better, if more complex to implement, solution.

The two blog posts linked to above describe the problem in detail, but here’s a quick overview. In the Internet Sales fact table in Adventure Works each row represents a line item on an invoice, and each line item has an Order Date (ie the date the order was received) and a Ship Date (ie the date the order was shipped to the customer). We would like to know, on any given date, how many invoice line items are part of orders that have been received but not yet shipped and so have an Order Date before that date and a Ship Date after that date.

The key to being able to solve this problem in SSAS and MDX is how you model the data – my previous difficulties were down to the fact that I had the obvious way of modelling it, with one measure group having regular relationships with an Order Date and a Ship Date dimension, fixed in my head. However if you take a completely different approach the problem becomes easy, and here’s my worked solution using the Adventure Works data:

First of all I created my cube with one Date dimension and two measure groups, both based on the Internet Sales fact table:

Since BIDS won’t let you build two measure groups from the same table in the DSV, I created a named query that duplicated the Internet Sales fact table and used that as the basis of the Ship Dates measure group. Both the Order Dates Count and Ship Dates Count have AggregateFunction set to Count, and the Date dimension joins to Order Dates on OrderDateKey and Ship Dates on ShipDateKey.

I then created the following calculated measures:

CREATE MEMBER CURRENTCUBE.MEASURES.OrderedToDate AS
SUM(NULL: [Date].[Calendar].CURRENTMEMBER,
([Measures].[Order Dates Count]));

CREATE MEMBER CURRENTCUBE.MEASURES.ShippedToPrevDate AS
SUM(NULL: [Date].[Calendar].CURRENTMEMBER.PREVMEMBER,
([Measures].[Ship Dates Count]));

CREATE MEMBER CURRENTCUBE.MEASURES.OrderedNotShipped AS
MEASURES.OrderedToDate – MEASURES.ShippedToPrevDate;

The first, OrderedToDate, gives the total number of line items ordered from the beginning of time up to the current date. The second, ShippedToPrevDate, gives the number of line items shipped from the beginning of time up to the date before the current date. If we subtract the second value from the first, as we do in the OrderedNotShipped measure, we can therefore find the number of line items have been ordered but not yet shipped on any given date:

And it’s pretty fast, too – certainly much faster than any other solution I’ve tried before in SSAS. There’s one obvious catch here though, and that is that you have to load the data from your fact table into your cube twice. What if you have a really large fact table, and doubling your processing time and storage in this way is not an option? Well, stay tuned for part two when I’ll show you a trick to avoid doing this and only use one measure group.

SQLBits 8 Registration Now Open

We’ve just opened registration for SQLBits 8, which will be taking place in Brighton (in the UK) on the 7th-9th April. Full details can, of course, be found on the SQLBits website at http://www.sqlbits.com/ and you can register here.

It’s going to be a full three-dayer, and we’ve got loads of amazing stuff planned for it that we can’t talk about just yet, but there’s still plenty we can say publicly. For example, for anyone interested in Microsoft BI on the pre-conference seminar day we’ve got Marco and Alberto on their SQLBI methodology and Thomas Kejser on optimising the MS BI stack amongst other things (though not me this time). Also, take a look at the list of sessions that have been submitted – there’s some great stuff there too. It’s the SQL Server event of 2011 in Europe, without a doubt (especially since it doesn’t look like there will be a PASS European Conference this year).

Hope to see you there!

Google Docs can access data from BigQuery

I don’t know how I missed this news from last week (luckily I saw it relayed here), but this is important: Google have announced that you’ll be able to access data stored in BigQuery direct from Google Docs:
http://googledata.org/google-docs/bigquery-meet-google-spreadsheets/

I’ve been following BigQuery and Google’s BI efforts for a while, and it looks like it’s only a matter of time before they have something resembling a pivot table built into a Google Docs spreadsheet – and that would be a very significant development in the world of web-based BI. Think about it: vast amounts of data stored in BigQuery, accessible through a SQL-like interface and able to be queried very quickly; a spreadsheet on the front; all that’s needed is some kind of OLAP-like-layer in between to make it easy for end users to build their own queries. And Google have done similar things to what’s necessary here with Fusion Tables and their collaboration with Panorama.

Microsoft really need an answer to this. We know SSAS in the cloud is coming, and we have the Excel web app, but again we need pivot table support in the web app to complete the picture (see here and here for more discussion on this topic). I hope it comes soon!

PowerPivot Videos from SQL Server Day 2010

Last month I had the pleasure of speaking at SQL Server Day 2010 in Belgium (I had a nightmare getting there and back because of the snow, but that’s another story). I’ve just seen that all the videos from the event are now available to view online and download here:
http://sqlserverday.be/video/

Both the sessions I gave are up: “Implementing Common Business Calculations in DAX” and “Comparing Analysis Services with PowerPivot”.

On a related note, we’ve aggregated all the content we’ve got for every single SQLBits on a new page on the SQLBits site:
http://sqlbits.com/content/
As far as I can see there are 152 videos on there alone!

Tuning the Currency Conversion Calculations created by the Add Business Intelligence Wizard

I see the ‘Add Business Intelligence’ wizard in BIDS as a bit of a missed opportunity on Microsoft’s part: it was a great idea to have functionality that would automate difficult stuff like adding MDX calculations to a cube, and it’s a shame that most of the MDX code it generates is so rubbish. Take, for example, the MDX currency conversion calculations that it creates. If you’re using Enterprise Edition the best way to do currency conversion is not to use MDX at all but to use measure expressions, but if you have Standard Edition you have no choice but to use MDX to do your currency conversion and if you use the calculations the wizard produces you’d probably think that this type of calculation has to be painfully slow. That’s not true, though – and in this post I’ll show you how to replace the currency conversion calculations created by the wizard with some that are not only much simpler but also much, much faster.

Let’s use a simple cube to illustrate this. In the Adventure Works database I created a cube with two measure groups: one based on the Internet Sales fact table, the other based on the Currency Rates fact table. I then added the Currency dimension and the Date dimension (joining on OrderDateKey in the Internet Sales table), and ran the ‘Add Business Intelligence’ wizard to add many-to-many currency conversion to the cube. Here’s what my Dimension Usage tab looked like after I’d run the wizard (note that it added a new Reporting Currency dimension to the cube):

And here’s the MDX that it adds to the MDX Script:

// <Currency conversion>

        // Currency conversion wizard generated script.
        // Currency conversion generated on: 11 January 2011 21:05:19
        // by user: Chris
        // Currency conversion type: ManyToMany
        // Selected members to be converted: Sales Amount
        // Please be aware that any changes that you decide to make to it may be
        // overridden the next time you run the Currency Conversion wizard again.

        // This is the Many to One section
        // All currency conversion formulas are calculated for the pivot currency and at leaf of the time dimension
        Scope ( { Measures.[Sales Amount]} );
            Scope( Leaves([Date]) ,
                [Reporting Currency].[USD],
                Leaves([Source Currency]));



              // Convert Local value into Pivot currency for selected Measures that must be
              //converted with Measure rate [Average Rate]
              Scope( { Measures.[Sales Amount]} );

               This = [Reporting Currency].[Local] / Measures.[Average Rate];

              End Scope;

            End Scope;

            // This is the One to Many section
            // All currency conversion formulas are calculated for
            //the non pivot currency and at leaf of the time dimension
            Scope( Leaves([Date]) ,
                Except([Reporting Currency].[Source Currency Code].[Source Currency Code].Members,
                {[Reporting Currency].[Source Currency Code].[Source Currency Code].[USD],
                [Reporting Currency].[Source Currency Code].[Source Currency Code].[Local]}));



            // This section overrides the local values with the Converted
            // value for each selected measures needing to be converted with Measure rate [Average Rate]…
            // LinkMember is used to reference the currency from the
            // source currency dimension in the rate cube.
            Scope( { Measures.[Sales Amount]} );
                   This = [Reporting Currency].[Source Currency Code].[USD] * (Measures.[Average Rate], LinkMember([Reporting Currency].[Source Currency Code].CurrentMember, [Source Currency].[Source Currency Code])) ;
            End Scope;


            End Scope; // Leaves of time, all reporting currencies but local and pivot currency
        End Scope; // Measures

// End of the currency conversion wizard generated script
// </Currency conversion>

Scary, eh? I won’t explain what it does in detail, but basically it does the following:

Converts the various local currency values held in the Internet Sales fact table into US Dollars
Converts these US Dollar amounts into whatever Reporting Currency the user has selected

It’s a lot of code for something so straightforward, and the reason why so complex is because it needs to use the Currency Rates measure group for both sets of currency rates: from local currency to US Dollars, and from US Dollars to reporting currency. As we all know overly-complex code is usually poorly-performing code, and in this case the use of LinkMember in particular is a killer; also there are no checks for situations where exchange rate values don’t exist, which is a fairly basic mistake.

Take the following query:

SELECT [Date].[Calendar Year].MEMBERS ON 0,
NON EMPTY
[Reporting Currency].[Source Currency Code].MEMBERS ON 1
FROM [CurrencyTestWizard]
WHERE([Measures].[Sales Amount])

It displays all Years on columns and all Reporting Currencies on rows for the Sales Amount measure, so it’s doing quite a lot of work. On my quite beefy laptop, using the cube I’d just created, it ran in just under 9 seconds on a cold cache to give me this:

We can do a lot better than 9 seconds though. The first step to optimising this is, as always, to push complexity out of the MDX and back into the cube design. Given that a fact table containing exchange rate values is usually pretty small relative to other fact tables, what I decided to do was to add a second measure group to my cube based on exactly the same Exchange Rate fact table, use the original measure group for the conversion from local currency to US Dollars and then the new one for the conversion from US Dollars to reporting currency. This did increase both the size of the cube and the amount of time taken to process it, but only by a tiny amount.

BIDS didn’t let me create a second measure group from a fact table that was already in use, so I had to create a new named query in BIDS that did a SELECT * from the Exchange Rate fact table and then use that. Having done this, I was able to set up my Dimension Usage as follows, with Reporting Currency joining to this new measure group:

Now, I could replace the code above with the following MDX:

SCOPE({[Measures].[Sales Amount]});
    SCOPE([Date].[Date].[Date].MEMBERS);
        SCOPE([Source Currency].[Source Currency Code].[Source Currency Code].MEMBERS);
            SCOPE(EXCEPT([Reporting Currency].[Source Currency Code].[Source Currency Code].MEMBERS
                        ,{{[Reporting Currency].[Source Currency Code].&[2147483647]}}));

                THIS = IIF([Measures].[Average Rate – Reporting Currency Rates]=0
                            , NULL
                            ,IIF([Measures].[Average Rate]=0
                                , NULL
                                ,([Measures].[Sales Amount] / [Measures].[Average Rate]))
                                * [Measures].[Average Rate – Reporting Currency Rates]);
            END SCOPE;
        END SCOPE;
    END SCOPE;
END SCOPE;

What I’m doing here is using a single set of nested scoped assignments to perform the calculation at the leaf level of the Date, Source Currency and Reporting Currency dimensions. With two different measure groups containing exchange rates there’s no need for the use of LinkMember – I just use two different measures for each step in the conversion – and I make sure I don’t bother doing any conversions where exchange rate values don’t exist.

When I ran my test query, I got exactly the same results but the query ran in around 0.5 seconds on a cold cache – 18 times faster than on the original version of the cube! It just shows you what a few tweaks to your cube design and MDX can achieve, doesn’t it? It’s a shame that MS didn’t invest a bit more time on the ‘Add Business Intelligence’ wizard to improve it – it would probably have improved the quality of many implementations no end. In the future, with PowerPivot and BISM being focused so much on ‘self-service’ BI, I think something similar to the wizard is a ‘must have’ to help users create complex DAX calculations, and I hope MS realise that functionality like this can be key to the overall success of a product.

PowerPivot Training in London (and elsewhere)

As you’re probably aware by now, PowerPivot and DAX are important parts of the Microsoft BI story going forward – self-service BI is the big new thing, while on the corporate side BISM will use DAX as its calculation language and BISM models will essentially be PowerPivot models. So it makes sense to learn PowerPivot and DAX as soon as possible to get a head start, right?

While that thought’s in your head, I’m pleased to mention that I’m helping Marco Russo and Alberto Ferrari (authors of the excellent book “PowerPivot for Excel 2010”) organise the London leg of their European PowerPivot training tour. Marco has more details on the course on his blog here, and you can find the course website here:
http://www.powerpivotworkshop.com/

Whether you’re a BI professional or an Excel power user, I think this represents an excellent opportunity to get training from acknowledged PowerPivot gurus. I’ll be there in the audience!

Report Portal

As I say whenever I talk about third party products here, I don’t do reviews on my blog, I just highlight products that look interesting and probably deserve closer inspection. So here is a non-review of a client tool I had a demo of today, Report Portal, a thin client, pure-html solution that includes both ad-hoc browsing capabilities and dashboarding/reporting. Although the UI looks a little dated, the fact that nothing needs to be installed on the desktop, that there are no requirements that might fall foul of corporate IT policies (which might, for example, rule out a Silverlight solution), and that it is licensed on a per-server basis do remove a lot of potential deployment headaches.

Rather conveniently, Igor, the guy who gave me the demo, realised that I’ve seen hundreds of client tools for SSAS and just showed me the features that make this particular product stand out from the pack, so that’s what I’ll talk about. Here are the main points:

It automatically creates a number of date calculations and relative time selections (such as ‘current month’, ‘previous month’) for you, meaning you don’t have to develop them yourself.
On time dimensions you can set up selection by a date picker, and also do date range selections by selecting a start and end date.
It can also do cascading parameters rather like what’s possible in SSRS or with Excel 2010’s slicers.
For drillthrough it allows you to build your own drillthrough query, select which measures you want, which attributes/columns you want and the order you want the columns to appear
There are a wide range of charts and visualisations to choose from, including an interesting (although possibly not Stephen-Few-approved) pie-chart tree report.
There’s also a load of other features, such as the ability to embed SSRS, OWC, SQL-based and other report types in dashboards; it supports writeback; it allows you to save comments in cells; and there’s also a blog and a wiki inside the portal.

Overall, it’s a solid, mature product that’s been around for six years and does pretty much everything you’ll want. Definitely looks like one to add to the shortlist if you’re looking for a zero-footprint SSAS client tool.

Month: January 2011

Solving the ‘Events in Progress’ problem in MDX, Part 2–Role Playing Measure Groups

Like this:

Solving the ‘Events in Progress’ problem in MDX, Part 1

Like this:

SQLBits 8 Registration Now Open

Like this:

Google Docs can access data from BigQuery

Like this:

PowerPivot Videos from SQL Server Day 2010

Like this:

Tuning the Currency Conversion Calculations created by the Add Business Intelligence Wizard

Like this:

PowerPivot Training in London (and elsewhere)

Like this:

Report Portal

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this: