Update Isolation Level connection string property

I recently came across a new connection string property, Update Isolation Level, mentioned briefly in BOL but with so few accompanying details that I had no idea how it could be used. Here’s what is currently documented:

From http://technet.microsoft.com/en-us/library/ms145488.aspx :

Important:

When updated cells do not overlap, the Update Isolation Level connection string property can be used to enhance performance for UPDATE CUBE. For more information, see ConnectionString.

From http://msdn.microsoft.com/en-us/library/microsoft.analysisservices.adomdclient.adomdconnection.connectionstring.aspx :

Update Isolation Level

None

Specifies whether the cells in a single UPDATE CUBE statement are isolated from each other. The default is not isolated.

Luckily Akshai Mirchandani and Marius Dumitru from the dev team were able to provide me more details on it…

Basically, Update Isolation Level is useful when you are using writeback and you are writing values back to multiple cells within the same UPDATE CUBE statement. This point is important: you won’t see a benefit if you’re only writing back to one cell in your UPDATE CUBE statement, or if you’re issuing multiple single-cell UPDATE CUBE statements within a transaction. It allows you to declare that the tuples in the SET clause of the UPDATE CUBE statement don’t overlap with each other; if that’s true, Analysis Services doesn’t need to do as much work at execution time and so the UPDATE CUBE statement runs faster.

Let’s take a look at some examples (I’ve used a simple cube I built based on the Adventure Works database, but it’s not Adventure Works). First of all, the following UPDATE CUBE statement contains two tuple assignments in the SET clause which overlap:

UPDATE CUBE [Sales]
SET
([Measures].[Sales Amount], [Order Date].[Date].&[20010701]
, [Product].[Product].&[1], [Customer].[Country – Customer].&[21383]) = 10,
([Measures].[Sales Amount]) = 10

The first tuple assigns the value 10 to Sales Amount for a particular Date, Product and Customer; the second assigns the same value to a tuple that contains just Sales Amount, and this tuple overlaps with the previous tuple because its value is the aggregation of all Sales Amounts for all Dates, Products and Customers. However, in the following example the two tuples do not overlap because they assign to Sales Amount for different Products:

UPDATE CUBE [Sales]
SET
([Measures].[Sales Amount], [Order Date].[Date].&[20010701]
, [Product].[Product].&[1], [Customer].[Country – Customer].&[21383]) = 2,
([Measures].[Sales Amount], [Order Date].[Date].&[20010701]
, [Product].[Product].&[486], [Customer].[Country – Customer].&[21383]) = 3

The Update Isolation Level connection string property has two possible values:

  • Update Isolation Level=1 means that each tuple update is guaranteed to be isolated, i.e. the tuples don’t overlap
  • Update Isolation Level=2 means that each tuple update is not guaranteed to be isolated

So, in theory, for the second example above if you set Update Isolation Level=1 in the connection string, the statement should run faster. I couldn’t see any differences in my limited testing but I’m assured that in more complex scenarios there are improvements – how much will depend on the amount of overlap, the structure of the cube and so on.
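For reference, here’s a sketch of what a connection string that sets this property might look like (the server and database names here are just placeholders, and the other properties are typical ADOMD.NET settings rather than anything specific to this feature):

Data Source=myserver;Catalog=Sales;Update Isolation Level=1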

No BI Conference this year

Something else I’m a bit late on, having been away on holiday this week, is the announcement that there won’t be a Microsoft BI Conference this year:
http://blogs.msdn.com/bi/archive/2009/04/10/microsoft-bi-conference-update.aspx

Apparently the same customers who told Microsoft to kill PerformancePoint planning have also been saying that they should only hold the BI Conference every other year. As an aside, did these marketing people ever get told on their MBAs not to treat their customers like idiots? Is there anyone out there that still falls for that ‘our customers have told us’ line? Surely the whole point of blogging – even when it’s done by a marketing person – is that it’s meant to be (or should appear to be) an honest, person-to-person type of communication and doesn’t work when you simply reuse the same content you would have stuck in a press release?

Anyway, while I agree with many of the points Marco makes about a separate BI Conference being necessary, it makes my life easier in a way – it means I no longer have to make a decision about whether to attend the PASS Summit or the BI Conference.  PASS is the conference to attend and will have extra BI content; I’ve already submitted one abstract and will think of some more potential topics to talk about (any suggestions?), so with a bit of luck I’ll see you there.

SSAS2008 and Kerberos

Kerberos is notoriously difficult to set up – it’s something I’ve always left to someone else to do! Anyway, I’ve just come across a very detailed white paper from Boyan Penev and Mark Dasco about setting up Kerberos for Microsoft BI which I thought was worth linking to for future reference:
http://bp-msbi.blogspot.com/2009/04/enhanced-security-and-integration-of.html

However, I’ve heard from a number of sources (Dr John Tunnicliffe; Dan English) recently about problems with getting Kerberos to work with Windows Server 2008 or Vista, and this post from John Desch at MS confirms the bug – worse, it doesn’t seem to be fixed in SP1 (which was released this week):
http://blogs.msdn.com/psssql/archive/2009/04/03/errors-may-occur-after-configuring-analysis-services-to-use-kerberos-authentication-on-advanced-encryption-standard-aware-operating-systems.aspx

One to watch out for…

SQL Solutions OLAP Heartbeat and OLAP Performance Advisor

Hmm, you wait for years for commercial tools for monitoring Analysis Services (the only one I’d ever seen before was Companion for Analysis Services from SQLMinds) and then two come out at once. One of these tools I’ll be blogging about towards the end of this week, hopefully – I’ve had a sneak preview and it looks very cool – but today I heard from SQL Solutions about the release of their OLAP Heartbeat and OLAP Performance Advisor tools.

I’ll download both and give them a thorough test as soon as I can.

EXISTING AND NONEMPTY

Here’s an interesting MDX nugget. The other day I was working on something similar to the following query, which finds the number of customers who bought something in the current month as well as in the previous month:

WITH
MEMBER MEASURES.TEST AS
COUNT(NONEMPTY(
NONEMPTY(
EXISTING [Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount])
, ([Measures].[Internet Sales Amount], [Date].[Calendar].CURRENTMEMBER.PREVMEMBER))
)
SELECT MEASURES.TEST
*
[Product].[Subcategory].[Subcategory].MEMBERS ON 0,
[Date].[Calendar Year].[Calendar Year].MEMBERS
*
[Customer].[Country].[Country].MEMBERS ON 1
FROM [Adventure Works]

On my laptop, this runs in almost 16 seconds – not good. What I was doing here was assuming that it was a good idea to do the EXISTING first, to filter the set of customers down as much as possible, and then do the two NONEMPTYs. In fact I found that by moving the EXISTING so it was applied after the two NONEMPTYs had been evaluated, the query ran much faster:

WITH
MEMBER MEASURES.TEST AS
COUNT(EXISTING 
NONEMPTY(
NONEMPTY(
[Customer].[Customer].[Customer].MEMBERS
, [Measures].[Internet Sales Amount])
, ([Measures].[Internet Sales Amount], [Date].[Calendar].CURRENTMEMBER.PREVMEMBER))
)
SELECT MEASURES.TEST
*
[Product].[Subcategory].[Subcategory].MEMBERS ON 0,
[Date].[Calendar Year].[Calendar Year].MEMBERS
*
[Customer].[Country].[Country].MEMBERS ON 1
FROM [Adventure Works]

On my laptop the above query now executes in 3 seconds. Why is this? To be honest, I’m not completely sure… looking in Profiler and Perfmon I see differences, but not many; these things are better left to the likes of Mosha who know what’s going on inside the formula engine. Anyway, I thought I’d post this up because it’s a common form of calculation – so if you are using NonEmpty and Existing together, do the NonEmpty first and then the Existing.

SQLBits IV

So, yet another SQLBits finished, and judging by the feedback we got it was the best yet. Certainly we had our highest ever attendance and the sponsors all seemed very happy (which is important because, after all, they’re the ones paying for it); we’ve also had a number of very kind emails thanking us too. Now for a few months of peace and quiet before we start planning for the next one, which I suppose will be sometime in the autumn. Any suggestions on where to hold it? Bristol, Oxford, Leeds?

Incidentally, if you took any photos or videos at SQLBits why not share them on the SQLBits Facebook group? There’s also a SQLBits LinkedIn group and the #sqlbits tag on Twitter that you might want to check out too. Oh, and while you’re at it why not join the UK SQL Server User Group Facebook group as well? We’re planning some BI user group events soon, I promise.

I suspect the current economic problems have had a lot to do with the success of SQLBits. Apart from the fact that a free conference is the perfect answer to a reduced training budget, I guess everyone’s worried about job security and therefore keen to invest in their own education – Simon Munro wrote a good blog entry on exactly this topic last week which is worth a read. However, I found it interesting talking to other BI professionals at the conference (and yes, networking is another great reason to come along) about their pipelines, because everyone was saying how busy they were at the moment. When the recession hit I was sceptical of predictions that the BI sector would survive unscathed but, although I’ll admit I’ve not been as busy this year as I was last, I’m still doing more than OK. Looking at the visitor stats for this blog and my company website, I’ve noticed hits from a number of companies that have been in the news for their, ahem, financial problems – and the conclusion I draw is that there are a lot of organisations out there who have realised how little they know about their own financial situation and hope BI software is the answer.

Aleri Live OLAP

There’s no real reason for this post except to indulge my trainspotterish habit of making a note whenever I find another OLAP product that supports MDX. From a recent post on Curt Monash’s blog I came across Aleri Live OLAP (see also here), an MDX-capable OLAP server that works on streaming data coming from their complex event processing platform.

SDS, Analysis Services in the cloud, and local cubes?

I was just reading through a list of questions and answers about the new, more SQL Server-like SDS, on the SDS team blog and had a thought. Here’s three points that are made in the post:

When? or to quote JamieT “When do I get to party on this new service”?

We’re on track to deliver a public CTP mid-calendar year 2009 and ship in the second half of calendar year 2009.

The blog entry states “If it works with SQL Server, it will largely work with SQL Data Services.”. That word “largely” bothers me a little – it suggests the functionality is going to be reduced slightly. Details please?

We will be providing documentation soon on what is and is not supported in SDS. I’ll post an entry to the blog once the guidance is available and you can also keep an eye out for it on our MSDN Dev Center. But, to answer the question – We say *largely* due to the fact that there are things that just don’t apply in a cloud based world like setting the location of a data or log file or making server wide configuration changes. In v1 we expect to deliver a surface area that will support the vast majority of SQL Server database applications.

Will you offer hosted SSIS/SSAS/SSRS?

It’s on the product roadmap, but I can’t comment on specifics or timing.

So, we’ll get a CTP in a few months, it’s going to be mostly compatible with existing SQL Server apps, but we’re not going to get Analysis Services in the cloud just yet. What can we do while we’re waiting for cloud-based SSAS then? Well…

  • It seems highly likely that we’ll be able to hook a normal, server-based instance of SSAS up to SDS and use it as a data source for building cubes. It would be a pretty silly thing to do though, I’m sure, because it would take ages to process a cube, but…
  • Wouldn’t that make ROLAP a more attractive option as a storage mode? No processing needed then, just SQL queries generated whenever the data is needed. However, ROLAP is slow now and is likely to be even slower when you’re querying SDS, but…
  • For some OLAP apps, you could dispense with a server-based instance of SSAS altogether. One little-known (and little-used) feature of SSAS is the ability to build ROLAP local cubes. As you probably know, a local cube (.cub file) is a portable cube that doesn’t need full Analysis Services installed anywhere. Since storing a local cube file somewhere in the cloud would be dead easy, I can imagine a scenario where you create a ROLAP local cube file – which would be no bigger than a few KB in size – allow people to download it, and then when they connect to their cube from Excel or wherever the local cube would then in turn retrieve the data it needs from SDS. Not exactly SSAS in the cloud, and probably only likely to work for small amounts of data and simple queries, but it’s an approach and not unlike what CubeSlice has been offering for a few years.
  • With local cubes you could always convert them to MOLAP storage if you wanted faster query performance (at the expense of having longer processing times) and of course local cubes seem to be an important part of the Gemini story too. What we’d really need are easier ways to create local cubes and support for easy switching of storage modes (from ROLAP to MOLAP/Gemini) to make this as smooth as possible.
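If you’ve never built one, local cubes can be created with the MDX CREATE GLOBAL CUBE statement. Here’s a rough sketch of what a statement building a .cub file from the Sales cube I used earlier might look like – the file path and the measure and dimension names are purely illustrative:

CREATE GLOBAL CUBE [SalesLocal]
STORAGE 'C:\SalesLocal.cub'
FROM [Sales]
(
MEASURE [Sales].[Sales Amount],
DIMENSION [Sales].[Product],
DIMENSION [Sales].[Order Date]
)

You’d run this over a connection to the server cube, and the resulting .cub file could then be opened directly from Excel without a server connection.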

Query behaviour when you cross the LowMemoryLimit

Here’s something that had me stumped for quite a long time earlier this week…

I’m doing some performance tuning work for a customer at the moment, and following my usual approach I took a test query, cleared the cache, ran it and noted how long it ran for and what I saw in Profiler and MDX Studio. However, this time I saw some strange things happening: I’d run the same query on a cold cache more than once and the query times would differ by anything up to a minute; even more confusing, the information I got from Profiler and MDX Studio would be different too – for example, the number of Query Subcube Verbose events recorded in Profiler for a query might differ from run to run by several hundred. This had me completely confused. There was no-one else on the server and I was running on a cold cache. What was going on?

It turned out that I was running out of memory. I was working on a 32-bit box with a lot of databases on, so even on a cold cache the amount of memory used by SSAS was approaching the LowMemoryLimit threshold (see Marco’s post here, the AS2005 Performance Guide and this white paper for an explanation of what this is). When I ran the query – which had a lot of calculations – memory usage went above the LowMemoryLimit so the cleaner thread was waking up and evicting elements of shrinkable memory, which presumably included data in cache that had been put there by the current query. Since the exact moment that this threshold was crossed would vary from run to run, and as I assume that what the cleaner thread would evict from memory would also vary, this explained the different query times and Profiler activity. As soon as I moved onto a 64-bit box with no other databases this stopped happening.
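For reference, the threshold is exposed as the Memory\LowMemoryLimit server property, visible in the Analysis Services server properties dialog in Management Studio or directly in msmdsrv.ini. The fragment below is only a sketch – the value is expressed as a percentage of physical memory and the defaults differ between versions, so don’t take these numbers as gospel:

<Memory>
	<LowMemoryLimit>65</LowMemoryLimit>
	<TotalMemoryLimit>80</TotalMemoryLimit>
</Memory>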

Of course nowadays I would expect that most people are running on 64-bit so it’s much more unlikely that you’ll ever cross the LowMemoryLimit, but I do see it happen from time to time. So if you’re in the same position and suffer from inconsistent query performance, check your memory usage!

BI Survey 8

As I think I’ve probably said before, I don’t get many freebies as a blogger but one that I do get and I really appreciate is my review copy of the BI Survey. It being that time of year, I got the latest edition – the BI Survey 8 – last week along with an email encouraging me to blog about it, and who am I to refuse a request to blog about something as fascinating as this? It’s a whopping 489 pages long so I can’t even begin to summarise it, but there are some points regarding Microsoft BI that I’d like to pick up on.

In general, the findings show the Microsoft BI stack as a solid and successful suite, extremely good value for money, but by no means a stellar performer. There’s a long section at the end of the report showing a wide range of technical and project-related KPI ratings (for example query performance, business benefits achieved, quality of support, cost of deployment and so on) and in almost every respect Analysis Services and Reporting Services come out in the middle of the rankings. This doesn’t really surprise me much: let’s face it, what we’ve got works well, but from a technical perspective there’s not been much new and exciting in the world of Microsoft BI for a while now (although with Gemini and Madison coming soon that will change), the platform in general still has some glaring holes and overlaps, and as the PerformancePoint debacle showed recently Microsoft’s overall BI strategy is somewhat confused. In fact, I suspect Microsoft’s entire BI strategy is not a BI strategy at all but a get-people-to-upgrade-to-the-latest-version-of-Office strategy, but I digress…

For Analysis Services in particular, the survey showed 15% of those surveyed were on AS2K, 79.6% on AS2005 and 3.5% on AS2008. Given that most people must have been surveyed last summer, well before the RTM of AS2008, that’s pretty good, and with the migration path from 2005 to 2008 being very smooth I should think the AS2008 percentage will look much better next year. Analysis Services remains the top-ranked BI tool used against Microsoft’s own databases, as you’d expect since it’s essentially free with SQL Server, but it also comes in a close third for Oracle (while Oracle’s own BI tools come in 12th!), second for IBM, third for Teradata and top for Open Source databases.

In terms of the client tools used with Analysis Services, just over 70% of people were using one of Microsoft’s own tools – mostly pivot tables and Reporting Services. 25% of SSAS users had Proclarity, and interestingly there were more people using the old free Excel add-in than PerformancePoint, and a sizeable minority still using Data Analyzer. There’s clearly a lot of demand for a client tool from Microsoft but from the looks of things most people are still stuck with Office 2003; this just adds weight to my argument that Microsoft coupling its BI strategy so closely to Office might help Office adoption rates but has a serious negative impact on the success of the BI strategy itself… sorry… there I go again… As far as third-party tools go Panorama is still the #1 vendor, but it only has an 11% share and only 22% of SSAS users overall had any kind of third-party tool in use. Incidentally Panorama was treated as a BI vendor in its own right for the purposes of the survey and came out top in a lot of the high-level KPIs including the overall KPI ranking – they obviously have some very enthusiastic customers.

So who, apart from Panorama, seems to be doing well? Qlikview certainly is, which at least validates Microsoft’s decision to go after that market with Gemini; Microstrategy does well too. Essbase comes out badly with great query performance offset by poor support and product quality; Cognos, apart from TM1, doesn’t seem to do too well either. Clearly it’s the vendors who are 100% focused on BI that are the most successful, which is as you’d expect.