There’s an interesting thread on the Analysis Services MSDN Forum that I’ve been following for a while now (it’s over 40 posts long at the time of writing) that gives a good explanation of why metadata access on very large or complex cubes can be so painfully slow. Here’s the link:
In summary, it turns out that the more files there are on disk in your SSAS database folder, the slower metadata reads become, and when you reach very large numbers of files almost any kind of metadata access becomes incredibly slow. There are a lot of different factors determining how many files are created for your SSAS database, but large numbers of cubes, attributes, partitions and aggregations (which end up being multiplied across each partition) are the main culprits. The problem is that the way the EstimatedSize property on metadata objects has been implemented means SSAS has to iterate over all the files in every subdirectory of the database’s directory in the data folder to find their size.
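To get a rough idea of how exposed one of your databases is to this problem, you can do the same kind of file scan yourself. Here’s a minimal Python sketch that walks each database directory under the data folder and reports the number of files and their total size; note that the data folder path below is just an example for a default SSAS 2008 instance, so you’ll need to point it at your own instance’s data folder:

```python
import os

# Example data folder for a default SSAS 2008 instance - this path is an
# assumption, so change it to point at your own instance's data folder.
DATA_FOLDER = r"C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Data"

def count_files(path):
    """Walk every subdirectory (roughly the work SSAS has to do when
    EstimatedSize is requested) and return the total number of files
    and their combined size in bytes."""
    file_count, total_bytes = 0, 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(root, name))
    return file_count, total_bytes

# One line of output per database directory (the *.db folders in the data folder)
for entry in sorted(os.listdir(DATA_FOLDER)):
    db_path = os.path.join(DATA_FOLDER, entry)
    if os.path.isdir(db_path):
        files, size = count_files(db_path)
        print(f"{entry}: {files} files, {size / (1024 ** 2):.0f} MB")
```

If a database directory turns out to contain tens or hundreds of thousands of files, that’s a good sign you’re likely to run into slow metadata access.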
One workaround is to set the Storage Location property of your partitions to somewhere outside the data directory, since this means they won’t get scanned, but doing this makes backup and restore difficult and means you get inaccurate size estimates. The only other option is to reduce the number of partitions and aggregations you have; given that the SQLCAT team are now saying that partitions of 50-60 million rows each are feasible in SSAS 2008, and that you can use different aggregation designs for partitions containing old and new data, there’s probably a lot of scope for doing this in most cubes. Hopefully, though, this problem will be fixed in a future release…
We currently have the following situation and are wondering whether it is the metadata being queried when partitions are being added that is causing the problem.
We have a Cube with around 2k Partitions. Each day another 1,500 are added before they get merged down. It seems to take an extremely long time to add these partitions (10 mins+), during which time access to the Cube is very restricted.
We have spent a lot of time investigating this ourselves without success. CPU, IO and memory utilisation are all very low whilst the partitions are being added, so it doesn’t look like a hardware issue.
Yes, it sounds like you’re running into this problem. It also sounds like you have a large number of partitions; how much data do you have in each partition? I know I don’t have all the details, but it sounds like you have too many partitions, and I’m not sure merging large numbers of partitions is a good idea either…
Hi Chris,
Thanks very much for responding.
The volume naturally varies from measure group to measure group. The larger ones have around 16m rows and vary greatly in size, 1GB+.
Is there a definitive way I can prove that metadata is the issue? I’ve been trying different counters in perfmon, and although some numbers look high I’m not sure what to look for regarding this issue.
I will get back to you with a description as to why we have so many partitions. I’m pretty sure it was a conscious design decision. Right or wrong ☺
Please can you expand on your comment ‘I’m not sure merging large numbers of partitions is a good idea either…’
Thanks again.
Jamie.
The other question I should have asked is which version of SSAS you’re running – I can’t remember, but I’m pretty sure that this issue was fixed in SSAS 2008 R2. If you’re running on R2 or 2012 and still having a problem then it’s probably not this issue. I don’t know of a way you can prove for certain that you are running into this problem, though.
I believe that if you regularly merge partitions, then over time the performance of the merged partitions degrades unless you occasionally do a Full Process on those partitions. Are you doing this?