Analysis Services and Solid State Disks

There’s a lot of discussion going on in the SQL Server relational world about solid state disks and their benefits, but for some reason very little has been said about how useful they could be for Analysis Services. And yet, with their impressive performance for random reads, SSAS and SSDs would seem to be the perfect combination. I think the reason for the silence is that there’s less general understanding of SSAS performance issues out there in the blogosphere, and probably also since there are fewer really large installations of SSAS not many people have had the chance to test SSDs with SSAS properly.

Anyway, the other week I was lucky enough to spend some time testing SSDs with SSAS for a customer of mine. For reasons of privacy I can’t share any significant details about what I found, and before I go on I need to point out that what I was doing was comparing the performance of a particular SSAS cube on a particular existing hard disk configuration with the same SSAS cube on a particular SSD configuration. So while it’s certainly possible to say that, in lab tests, SSDs can perform extremely well for certain tasks, in the real world you have to ask the following question: what benefit will an SSD give me for my cube, over my existing disk configuration? So in fact even if I could share the details of my tests they wouldn’t be all that useful.

That said, here are a few high-level findings:

  • Backup on the SSD was approximately 50% faster than on their existing, very respectable, hard disk configuration; Restore was 65% faster. These were the clearest examples of the SSD showing massive performance improvements.
  • For various reasons I couldn’t do full tests on processing performance, but Process Index operations were about 10-15% faster, as was Synchronisation.
  • Performance of individual queries was not significantly faster, maybe 1-2%; there were some cases where, with large numbers of concurrent users, I saw a bigger improvement of around maybe 10%. This didn’t surprise me at all because I knew that any query performance issues the customer has are related to the formula engine and not the storage engine, so IO wasn’t much of a problem in the first place.

These tests also led to some interesting behind-the-scenes discussions about how you can actually measure the impact of IO on SSAS storage engine operations, which Teo has already blogged about here and which are well worth reading:
http://prologika.com/CS/blogs/blog/archive/2010/04/13/analysis-services-i-o-time.aspx
http://prologika.com/CS/blogs/blog/archive/2010/04/16/using-xperf-to-test-analysis-services.aspx

For this particular customer, my recommendation was that it wasn’t worth the time, money and effort to move to SSDs, at least from a query performance point of view. In fact, based on several years experience of tuning SSAS implementations I find that in most cases IO is not one of the most important factors in determining query performance – it’s more often the case that the problem lies in the formula engine, and even when it doesn’t, an appropriate partitioning strategy and aggregation design can work wonders.

However, let me be clear: I am not making some general statement that SSDs are not useful for SSAS. What I am doing is giving the usual ‘your mileage may vary’ and ‘it depends’ answer. There are definitely going to be SSAS implementations where SSDs will make a massive difference to query performance, and where no other tuning technique can have the same impact. For example, I’ve heard that with distinct count measures (especially when there are a large number of concurrent users) SSDs can lead to very impressive improvements in query performance; I would assume that where there are large many-to-many relationships you would also benefit accordingly. What you need to do before you spend money on SSDs is to understand the causes of any performance issues you currently have and make sure you tune your cube to the best of you abilities.

Has anyone else had any experience with SSAS and SSDs they’d like to share?

UPDATE: the SQLCat team have published this technical note on the subject: http://sqlcat.com/technicalnotes/archive/2010/09/20/analysis-services-distinct-count-optimization-using-solid-state-devices.aspx

7 thoughts on “Analysis Services and Solid State Disks

  1. … What you need to do before you spend money on SSDs is to understand the causes of any performance issues you currently have and make sure you tune your cube to the best of you abilities…. Could that translate to \’Have the talented Chris Webb take a look first and see if it you can benefit from more conventional tuning first\’? 😀

  2. Hi Chris,We have tested and gone into production with HP branded Fusion IO SSDs (which, I believe, are a completely different story than "conventional" SSDs – not sure what you tested) and overall, we are very pleased when comparing to our (more expensive) 6 x 72 GB 15k Raid 1+0 setup. We captured one week’s worth of MDX workload (6000 statements) and replayed it against both a comparable server with the new SSD disk and our otherwise passive Analysis Services machine using conventional storage. Some highlights: the top 10 % slowest queries were answered approx. 4x faster. The 90% fastest queries 2x faster. Twice as many queries were answered in duration = 0. We replayed trace in Profiler using 4 concurrent threads. The potential gain is greater because the bottleneck became CPU bound – also because the Fusion IO SSDs are more CPU intensive for unknown reasons.Surely, things will look different in other environments, but I\’m pretty sure that this type of SSD will outperform conventional storage in most scenarios. After som correspondance with the SQL CAT team, I also learned that SSDs are most likely the new recommended storage methoed for SSAS (rather than SANs, which was the old recommendation).I\’d be happy to share more findings and the correspondance by email if you\’re interested.

  3. Hey Chris,To add to Martin\’s point – we (SQLCAT) have a draft whitepaper of using SSDs with SSAS in conjunction with the XBox team that will be coming out in the next month or so. Similar to Martin\’s scenario – we were also using Fusion I/Os. I do agree with most of your above statements but would amend:- SSDs are very beneficial when working with distinct count queries and/or any that involve heavy storage engine usage, even for single user queries.- This becomes even more apparent when dealing with concurrent user queries when the SE is pretty heavily used- As systems becomes larger and costs of SSDs become cheaper, going down this path for SE heavy queries becomes even more beneficial: query and cost- We also have a super early draft describing how we can do scale out using the read-only feature on SSDs as wellHTH!Denny

  4. I have a client that has seen some pretty good performance with solid state drives. They do have quite a bit of distinct counts, lots of partitions, and the users pull lots of data into Excel. I think the drives help the reads off of the numerous paritions and the cache. I think it is worth looking into if performance is an issue.

  5. I\’ve had some private feedback from other readers this week to the effect that distinct count measures show an incredible increase in performance with SSDs, so if you do have distinct count measures they sound like a must-have.

Leave a Reply to ChrisCancel reply