SQLBits 7 Registration Open

If you haven’t heard already (and I’m at least twelve hours late on this news) then the big news is that registration has opened for SQLBits 7. The conference will be taking place from Thursday September 30th to Saturday October 2nd in York, in the UK; to find out more about what’s going on, and to register, go to http://www.sqlbits.com/. At the time of writing 116 people have registered already, and we haven’t even published the agendas for the Friday or Saturday yet…!

I’m going to be running one of the training days that will be taking place on Thursday 30th too: a whole day on performance tuning Analysis Services 2008. I’ll be covering designing aggregations, partitioning, caching and cache warming, tuning queries and calculations, and hardware configuration; basically it will be a brain dump of everything I know on the subject and at a very modest price too…

The Deployment Wizard, and when “Retain Partitions” doesn’t retain partitions

I learned something the hard way this week about how the Deployment Wizard actually works (thank goodness for daily backups) that I thought I’d share.

I’ve used the Deployment Wizard lots of times to deploy changes to an SSAS database in production. If you have a system where new partitions are generated dynamically, for example in an SSIS package that creates new partitions automatically when new data is loaded into the cube, then these partitions are not going to be present in your project in BIDS and you don’t want them to be overwritten when you deploy the new version of your database. The Deployment Wizard allows you to stop this happening, and also to avoid overwriting other settings that might differ between dev and prod, such as connection strings and security roles.

However, what I didn’t realise was that in order for the Deployment Wizard to work properly in all cases it needs to be able to see the server you want to deploy to, and in my current customer’s environment you can’t see prod from dev. Working on dev, what I did was to edit the .deploymenttargets input file manually to enter the name of the database in prod I wanted to deploy to, then run the wizard with the /a argument (see here for more information on these settings) to set other options in the input file such as Retain Partitions, then run the wizard again with the /o and /d arguments to generate the XMLA deployment script without needing to connect to the target server (which, of course, I couldn’t see). And then, when I ran the resulting script in prod, I saw all the partitions that weren’t in dev disappear even though I’d selected the Retain Partitions option.
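For reference, here’s roughly what that two-step invocation looks like from the command line (the file names are hypothetical; the /a, /o and /d switches are the documented ones for the wizard’s executable, Microsoft.AnalysisServices.Deployment.exe):

rem Step 1: answer mode - review and save options such as Retain Partitions
rem back to the input files, without actually deploying anything
Microsoft.AnalysisServices.Deployment.exe MyDatabase.asdatabase /a

rem Step 2: output mode - write the XMLA script to a file; /d tells the wizard
rem to use only the input files and not to connect to the target server
Microsoft.AnalysisServices.Deployment.exe MyDatabase.asdatabase /o:deploy.xmla /d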

What I had assumed was that selecting the Retain Partitions option would make the wizard generate its XMLA Alter command in such a way that it simply ignored the partitions in each measure group. However, this is not how it works: an Alter command for a database must list all the partitions in all the measure groups in the database. This is why the wizard needs to connect to the target server: it scripts out all the partitions that currently exist on the target and inserts them into the script it generates, instead of just the partitions that are in the .asdatabase file. The /d argument, though, tells the wizard to use only the information in the input files and not to connect to the target server, so it can’t know what partitions are present on the target and has to use the partitions from the .asdatabase file instead. So, if you use the /d argument, even if you specify the Retain Partitions option you may see partitions deleted on the target when you run the wizard’s script.

MDX Course in Dublin

Although I’ve mentioned it before, let me plug it once again: I’ll be teaching a public MDX course in Dublin on September 16th and 17th. More details and how to register can be found here:
http://www.prodata.ie/Events/MDX2010.aspx

Building a Better Cache-Warmer, Part 2: The Formula Engine Cache

Sorry for the very long delay – at long last, here’s the companion piece to the post I wrote last September about cache-warming. That post dealt with the relatively straightforward topic of warming the Storage Engine cache; this time we’ll be looking at how to warm the Formula Engine cache. As I promised in the first post, the idea is that I’m going to take the ideas discussed here and use them to create a better cache warmer than the ones I’ve blogged about in the past.

The first thing to note when talking about caching the result of MDX calculations is that it does happen – a surprising number of people think that Analysis Services performs every MDX calculation needed by a query from scratch every time. This does indeed happen in worst-case scenarios – and it’s all too easy to do things which accidentally get you into these worst-case scenarios – but when SSAS can cache the result of calculations it can have a massive impact on query performance.

The second thing to note is that when I say that SSAS can cache the result of calculations, what I mean is that it can cache the values returned by cells in the cube which have been subject to some form of MDX calculation. Analysis Services can’t cache the result of MDX expressions as such; it can only do so when the result of those expressions is surfaced as a value returned by a cell somewhere in the cube. This means that if we create a calculated measure on the cube’s MDX Script, and then include this calculated measure in a query, we can expect that SSAS should be able to cache the values returned by this calculated measure. On the other hand, if we have a complex MDX set expression on the Rows axis of our query, the set that expression returns cannot be cached – it will be re-evaluated every time the query is run.
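To make that concrete, here’s a minimal sketch using Adventure Works-style names (the calculated measure itself is hypothetical). The measure is defined on the cube’s MDX Script, so the cell values it returns can be cached; the set expression on Rows, by contrast, is re-evaluated on every execution:

// On the cube's MDX Script - the cell values this measure returns can be cached
CREATE MEMBER CURRENTCUBE.Measures.[Avg Sales Per Customer] AS
Measures.[Internet Sales Amount] / Measures.[Customer Count];

// In a query - the measure's values are cacheable, but the NonEmpty set
// expression on Rows is re-evaluated every time the query runs
SELECT Measures.[Avg Sales Per Customer] ON COLUMNS,
NonEmpty([Product].[Category].[Category].Members, Measures.[Internet Sales Amount]) ON ROWS
FROM [Adventure Works]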

The third, and possibly most important, thing to remember is that SSAS can only cache the results of a calculation for as long as the calculation actually exists. As you may already know, there are three places you can create a calculated member in SSAS, and these equate to three different ‘lifetimes’. If you create a calculated measure on the MDX Script of the cube it will live until the next time the cube’s MDX Script is dropped and recreated, and this will happen if you explicitly clear the cache or you do any kind of processing. This is called ‘global’ scope. If, however, you create a calculated member in the WITH clause of a query then that calculated member will only exist for the lifetime of that query – so while you will benefit from caching while the query is executing, after the query has returned the calculated member will disappear and so will any related cache. This is called ‘query’ scope. The third scope, or lifetime, for creating a calculated member is ‘session’ scope: you can create calculated members that live for the lifetime of a session using the CREATE MEMBER statement, so values can only be cached for one user and one session; this is used only very rarely though.
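To illustrate, here’s the same hypothetical calculation declared with each of the three lifetimes; each statement would be run in a different place:

// 'Global' scope: lives on the cube's MDX Script until the Script is dropped
// and recreated, eg by processing or explicitly clearing the cache
CREATE MEMBER CURRENTCUBE.Measures.[Tax Ratio] AS
Measures.[Internet Tax Amount] / Measures.[Internet Sales Amount];

// 'Session' scope: run as a standalone statement, lives until the session ends
CREATE SESSION MEMBER [Adventure Works].Measures.[Session Tax Ratio] AS
Measures.[Internet Tax Amount] / Measures.[Internet Sales Amount];

// 'Query' scope: exists (along with any cache it builds) only while this
// one query executes
WITH MEMBER Measures.[Query Tax Ratio] AS
Measures.[Internet Tax Amount] / Measures.[Internet Sales Amount]
SELECT Measures.[Query Tax Ratio] ON COLUMNS
FROM [Adventure Works]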

The fourth thing to understand is that SSAS can’t cache the results of some calculations at all, or, if it can cache them for one user, it can’t share the contents of that cache with other users. This will happen, for example, if the MDX calculation might return a different result every time it runs (eg if it uses the Now() function to return the system date and time, which is obviously different each time it’s called) or if it might return different results for different users (eg when two different users run the same query through different security roles: because they may be able to see different parts of the cube, the values returned might differ). The way in which a query is constructed can also force ‘query’ scoping for the formula engine cache: the presence of a calculated member in the WITH clause can do this, as can the use of subselects. This also means that FE cache warming is pointless for certain popular client tools like SSRS or Excel, because the MDX they generate prevents the use of the FE cache.
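For example, both of the following query patterns can force ‘query’ scoping (names hypothetical again): the first because of the WITH clause, the second because of the subselect, even though [Tax Ratio] is defined on the MDX Script:

// A calculated member in the WITH clause
WITH MEMBER Measures.[Doubled Sales] AS Measures.[Internet Sales Amount] * 2
SELECT Measures.[Doubled Sales] ON COLUMNS
FROM [Adventure Works]

// A subselect
SELECT Measures.[Tax Ratio] ON COLUMNS
FROM (SELECT [Product].[Category].&[1] ON COLUMNS FROM [Adventure Works])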

Finally, the Formula Engine has two different structures it uses to hold cached values. For calculations that execute in bulk mode it can use the same structure that the Storage Engine cache uses, namely the data cache registry. For calculations that execute in cell-by-cell mode, however, it uses a different structure called the flat cache. Calculations that execute in bulk mode in most cases cannot make use of values stored in the flat cache, and calculations that execute in cell-by-cell mode cannot make use of values stored in the data cache registry. Furthermore, the size of the flat cache is restricted to 10% of the TotalMemoryLimit server property; if it grows bigger than that it will be completely emptied.

(Incidentally, if you’re looking for more detail on any of the above points, I suggest you watch the video of my session on “Cache Warming Strategies for SSAS 2008” here).

So clearly there are many potential pitfalls to watch out for when warming the FE cache, and indeed in many cases I’d say that it’s just easier to concentrate on warming the SE cache and tuning your calculations so they execute as fast as possible even on a cold FE cache!

If you do need to warm the FE cache though, you need to be very careful. Unfortunately the CREATE CACHE statement mentioned in the previous post only works for the SE cache, so the only way to warm the FE cache is to use MDX queries. These queries should be hand-written specifically for the purpose – if you simply record queries run in production using Profiler, it’s likely you’ll end up with queries that don’t warm the FE cache because they contain a WITH clause or a subselect – but it’s nonetheless a good idea to use production queries as a starting point and then modify them so they become FE-friendly. A smaller number of larger queries is going to be better than a large number of queries that return small amounts of data, to avoid cache fragmentation in the data cache registry, and it’s a good idea to keep them as simple (and as non-arbitrary-shaped) as possible.
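As a sketch of what an FE-friendly warming query might look like (hypothetical names once more): no WITH clause, no subselect, and one large, simple crossjoin that touches the calculated measures you want cached:

SELECT {Measures.[Tax Ratio], Measures.[Avg Sales Per Customer]} ON COLUMNS,
[Date].[Calendar].[Month].Members * [Product].[Category].[Category].Members ON ROWS
FROM [Adventure Works]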

Since it’s likely that the calculations you’re most interested in caching are the ones that, despite your best efforts at tuning them, execute in cell-by-cell mode, it’s a good idea to keep an eye on the overall size of the various flat caches in existence. The following DMV query, I think, shows the size of all the flat caches across all the databases and for every session on an SSAS instance:

select * from
$system.discover_object_memory_usage
where object_id='FormulaCache'

…but I need to do a bit more research to make sure it shows what I think it shows and to make sure I can observe this 10% limit resulting in the flat cache being emptied.
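Assuming it does, the columns of interest would be the shrinkable and non-shrinkable memory figures this DMV returns for each object, which you can pull out explicitly like this:

select object_parent_path, object_memory_shrinkable, object_memory_nonshrinkable
from $system.discover_object_memory_usage
where object_id='FormulaCache'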

Lastly, as with the SE cache, you don’t want to overfill it and leave no memory for the natural growth in cache that will occur as a result of normal querying; and you certainly don’t want to go over the LowMemoryLimit or TotalMemoryLimit while you’re warming the cache and end up with cache evictions taking place.

Speaking at the PASS Summit 2010

Hurray! I’ve been selected to speak at this year’s PASS Summit in Seattle:
http://sqlpass.eventpoint.com/topic/details/BIA412

Looks like another strong lineup of speakers; I hope they do a better job of scheduling the sessions this time though…

…and don’t forget, if you’re in the UK (or indeed Europe) and your budget doesn’t stretch to a flight to Seattle, there’s always SQLBits! Just look at the sessions we’ve had submitted so far:
http://www.sqlbits.com/information/PublicSessions.aspx

The Excel Web App and its missing API

A few weeks ago I was playing around with the Excel Web App and thinking that, while it’s better than the Google equivalent in some ways, the lack of an API seriously limited its usefulness. So I posted a question on the Excel Web App forum asking what the plans for an API were and got the following answer:

Currently there are no APIs exposed for Excel Web App, and we are not sure if this will be available in the future.

This was more than a little disappointing, to say the least… So I replied with reasons why an API would be a good idea and got Jamie to join in (he and I have very similar views on this type of subject) as well. You can read the whole thread here:
http://social.answers.microsoft.com/Forums/en-US/officewebapps/thread/eaa565f1-83b6-4e7c-a1ec-af31fc3d7a73

In summary, what I’d like to see is the Excel Web App be able to do the following:

  • Consume data from multiple data source types, such as OData, and display that data in a table
  • Expose the data in a range or a table as an OData feed

I think it would enable all kinds of interesting scenarios where you need data to be both human-readable and machine-readable at the same time: for example, imagine being able to publish data in a spreadsheet online and then have your business partners consume that data in PowerPivot at the click of a button. The posts on the thread go into a lot more detail so I’d encourage you to read it; Jon Udell also picked up the issue and blogged about it here:
http://blog.jonudell.net/2010/06/30/web-spreadsheets-for-humans-and-machines/ 

And now I need your help: the Excel Web App dev team asked for specific scenarios where an API would prove useful and both Jamie and I provided some, but I think the more we get (and the more people show that they want an API) the better. So if you’ve got some ideas on how you would use an API for the Excel Web App then please post them on the thread! The more noise we make, the more likely it is we can change the dev team’s mind.

Silverlight PivotViewer Control released

The Silverlight PivotViewer Control (as Live Labs Pivot is now officially called) has just been officially released. There’s loads of great content on the website here:
http://www.silverlight.net/learn/pivotviewer/
There’s also a good post on how to use the Pivot Collection Tool for Excel here:
http://whitneyweaver.com/post/A-Simple-Pivot-Viewer-Example.aspx
No sign of that tool for creating collections from SSRS that was demoed at TechEd yet, though.

It is a truly beautiful piece of software and puts to shame all of Microsoft’s previous attempts at BI client tools, although of course it doesn’t actually integrate with any of the rest of Microsoft’s BI stack (I’ve asked a question on the PivotViewer forum about whether there are any plans to fix this here – it really needs to happen). It also proves something I’ve said on this blog several times over the years: that the lessons learned in the business intelligence world about visualising and analysing large data sets could bring many benefits to the world of search. Look at this real example of how the PivotViewer control can be used to search for wedding venues in the UK:
http://www.hitched.co.uk/wedding-venues/visual-search.htm

And wouldn’t it be cool if you could use it to browse through the contents of your file system in the way I showed with Excel and PowerPivot recently?

IBAX SSAS Value Pack

Before I start, I need to declare that since Michael Brönnimann (one of the guys at IBAX behind this product) is a friend of mine I can’t really be considered an unbiased reviewer of this product; however, since I think it’s interesting and deserves wider exposure, I did think it was worth blogging about. In fact I’ve more or less given up reviewing products and books on this blog because I know so many people in the SSAS world that it makes impartiality on my part very difficult; it’s just easier to flag up interesting stuff and not try to make any kind of critical assessment.

Anyway, back to the matter in hand: a look at the IBAX SSAS Value Pack. The product’s web page is here:
http://www.ibax.ch/-IBX-/plain.simpleimagetitletextlinklist.en.solutions.products.valuepack/default.aspx
Basically, it’s a template Analysis Services solution containing solutions to many common cube design and MDX calculation problems that you can adapt for your own project. The idea is that, rather than building everything yourself from scratch, you bring your own data into an existing solution and so can build something very feature-rich very quickly, without having to spend too much time worrying about the design and implementation of the difficult stuff. And the template certainly is feature-rich: it has a full set of time series calculations, many other financial and analytical calculations, different types of currency conversion, support for showing measures in different scales, formatting and colour highlighting, and actions and drillthrough.

The next question is, of course, will you actually get any value from a product like this (assuming you don’t buy it as a service offering)? I mean, if you don’t know how to implement this stuff in the first place, are you going to be able to adapt someone else’s code to your own requirements? Well, there’s a lot of documentation and some introductory training included in the package to help get you up to speed; I think if you’re learning SSAS it always helps to have some worked examples to try to understand; and even if you throw away 80% of what you get, the remaining 20% will still have proved useful. If the alternative is to spend months and months developing something, only to find you’ve made some fundamental mistake in the design early on that you can’t now change, then it’s got to be better.

Arguably Microsoft should be providing something like this to its customers and partners; there have been various service offerings and solution accelerators developed for SSAS over the years but nothing much has ever come of them. The Add Business Intelligence Wizard in BIDS does something similar too, but again it never seemed to be a development priority and in the earliest releases of SSAS 2005 it suffered from generating MDX that didn’t follow best practices or, worse, just didn’t work. Adventure Works is probably the closest comparison, and although it does a good job as a sample database it’s not really a template for an enterprise solution. So it’s up to third parties like IBAX to fill the gap in the market…

SQLBits 7 Dates Announced

It’s SQLBits time again! Here’s our press release with all the details:

The SQLBits committee is pleased to announce that the next SQLBits conference, SQLBits 7, will be taking place from September 30th to October 2nd at York University. SQLBits is the largest SQL Server conference in Europe and the previous six conferences have established it as a must-attend event for DBAs, developers and BI professionals who work with Microsoft SQL Server in the UK and Europe. This SQLBits will be the biggest ever, with capacity for over 500 attendees.

This time SQLBits will be running the multi-day format that was so successful in Manchester and South Wales last year. Thursday 30th will be a training day, featuring in-depth, full-day seminars delivered by a number of well-known SQL Server professionals; Friday 1st will be a deep-dive conference day with advanced sessions delivered by the best speakers from the SQL Server community; and Saturday 2nd will be the traditional SQLBits community conference day, with a wide range of sessions covering all aspects of SQL Server at all levels of ability. There will be a charge to attend days one and two, but day three will, as usual, be completely free to ensure everyone has the opportunity to attend. The conference will be held at the Heslington Campus of York University, a modern venue easily accessible from all parts of the country by road and rail.

Session submission is open now and we are looking for presentations on the seven main parts of the SQL Server toolset: the SQL Server relational database engine and TSQL, Integration Services, Reporting Services, Analysis Services/PowerPivot, Master Data Services, StreamInsight and Parallel Data Warehouse. As a community organisation we are keen to have as many people involved as possible, so we welcome submissions from new or inexperienced speakers. To submit a session, please go to http://sqlbits.com/information/SessionSubmission.aspx.

If you have a SQL Server-related product or service, SQLBits is a unique way of promoting it to hundreds of SQL Server professionals and decision makers. We can tailor sponsorship packages for SQLBits to your specific business needs; anyone interested in sponsoring the conference can get more details by emailing us at contactus@sqlbits.com.

Just to sum up – speakers and sponsors, we need you!

2008 R2 Best Practices Analyzer released

The 2008 R2 version of Best Practices Analyzer has just been released – you can download it here:

http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=0fd439d7-4bff-4df7-a52f-9a1be8725591

I’ve not looked at it yet (it was only released a few hours ago) but I’m told it features an updated set of rules for SSAS. Running the BPA over your existing cube is a good way of spotting bad things you might have done by accident or because you didn’t know they were bad.