Build Your Own Analysis Services Cache-Warmer in Integration Services

Chris Webb Analysis Services January 2, 2007August 16, 2010 4 Minutes

Cache-warming is one of the most neglected performance-tuning techniques for Analysis Services: perhaps it seems too much like cheating? Yet almost everyone knows how much difference there can be executing a query on a cold cache and a warm cache so there should be no excuse not to be doing it, especially if you know what queries your users are likely to be running in advance. AS2005’s caching mechanism is more complex than I can describe here (or than I can describe full stop – although I hear that the recently published "Microsoft Analysis Services 2005" has some great information on this front) but a lot of the time it can cache raw data of the cube and quite often the results of calculations too; you’ll need to test your own cubes and queries to find out exactly how much you’ll benefit but almost every cube benefits to a noticeable extent.

I’ve recently implemented a simple cache-warming system for a few customers which I thought I’d share details of. Now I know that the documentation for asmd contains details of how you can use it for this purpose (see http://msdn2.microsoft.com/en-us/library/ms365187.aspx for details) but I didn’t go down this route for a number of reasons:

This example uses a batch file and I preferred to keep all my logic in SSIS, especially since the customers were already using it for cube processing.
I wanted to avoid making my customers have to get their hands dirty extracting the MDX needed. They were using Excel 2003 as their main client and as you may know Excel 2003 makes heavy use of session sets so extracting and modifying the MDX it generates to create single MDX statements would have been too much to ask.

Here’s what I did instead. First, I created a new SQL Server database to store all the queries I was going to use. Then I used Profiler to capture the necessary MDX: I made sure no-one else was connected to the server, started a trace which only used the QueryBegin event and which included the TextData column, got the user to open Excel and construct and run their query, then stopped the trace and saved the results to a table in my new database. After doing this a few times I ended up with several tables, each of which contained the MDX generated for a particular sequence of queries in Excel.

Next I created a SSIS package which took each of these queries and executed them. Here’s what it looked like:

The outermost ForEach container used an SMO enumerator to loop through every table in my SQL Server database and put the table name in a variable (the expression generated by the UI was Database[@Name=’CacheWarmer’]/SMOEnumObj[@Name=’Tables’]/SMOEnumType[@Name=’Names’]). Next a script task used this table name to create a SQL SELECT statement which returned every query in the current table and put that in another variable. Here’s the code:

Dts.Variables("GetMDXQueries").Value = "SELECT textdata from [" + Dts.Variables("TableName").Value.ToString() + "] where DatabaseName=’" + Dts.Variables("ASDatabaseName").Value.ToString() + "’"

Next I used an Execute SQL task to execute this statement and out the resultset into another variable, the rows of which I looped over using the innermost ForEach loop using an ADO enumerator. Inside this second loop I got the MDX query out of the current row and into a string variable in a Script task as follows:

Dts.Variables("MDXQueryString").Value = Dts.Variables("MDXQueryObject").Value.ToString()

Then used another Execute SQL task, connected to my cube, to run the MDX query. I’ve been unable to execute MDX queries inside a Data Flow task (except when going via SQL Server using linked servers, which is nasty), hence the use of an Execute SQL task here; I also found that I had to use an ADO connection to my cube – if I used an OLE DB connection all my queries ran twice for some reason. I also set the RetainSameConnection property on the connection to the cube to true so that queries which relied on session scoped sets created earlier in the workflow didn’t fail; nonetheless I also set the FailPackageOnFailure and FailParentOnFailure properties of the Execute SQL task to false just in case. I was then able to save the package up to my server and use SQL Server Agent to execute it immediately after cube processing had finished.

As I said, if you implement a cache-warming system you’ll want to test how much of a difference it makes to your query performance. The easiest way to do this is to clear the cache and then run the package twice, noting how long it takes to run both times. The difference between the two times is the difference between a cold and a warm cache. To clear the cache you can either restart the Analysis Services service or run a Clear Cache command in an XMLA query window in SQLMS. Here’s an example of the latter which clears the cache of the Adventure Works database:
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<ClearCache>
<Object>
<DatabaseID>Adventure Works DW</DatabaseID>
</Object>
</ClearCache>
</Batch>

Now I will admit that I’m not the world’s greatest SSIS expert so if anyone has any suggestions for improving this I’d be pleased to hear them. Please test them first though – as I mentioned above I found SSIS didn’t always work as I expected with SSAS as a data source! I’ve also created a similar package whih connects to the query log AS creates for usage-based optimisation, reads the data in there and uses it to construct MDX queries which it then runs against the cube. This has the advantage of removing the need for anyone to extract any MDX from anywhere; plus the queries it constructs return very large amounts of data so you can use up all that memory you get on 64-bit boxes. The problem is that at the moment some of the queries it constructs are way too big and take forever to run… when I’ve worked out how I want to break them up into smaller chunks I’ll blog about it.

UPDATE: Allan Mitchell has very kindly done some more research on what happens when you try to run an MDX query through an Execute SQL task and written it up here:
http://wiki.sqlis.com/default.aspx/SQLISWiki/ExecuteSQLTaskExecutesMDXQueryMoreThanOnce.html

Published by Chris Webb

My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel. View all posts by Chris Webb

Published January 2, 2007August 16, 2010

60 thoughts on “Build Your Own Analysis Services Cache-Warmer in Integration Services”

Niels says:

July 9, 2008 at 1:09 pm

Hi Chris

Greate blog – I am fairly new to SSIS and "Cache warning" and I am having a hard time getting the SSIS package to work as you describe.
Would you been kind and upload your SSIS package so I can have a look at it and get mine to work

Thanks in advance
Niels

Loading...

Reply
Chris says:

July 9, 2008 at 5:46 pm

I mailed you the package using the address you left here – hope you get it…

Loading...

Reply
Unknown says:

October 15, 2008 at 7:59 pm

i am looking for a create cache tables,

please Would you been kind and upload your SSIS package so I can have a look at it and get mine to work

Loading...

Reply
Unknown says:

October 15, 2008 at 8:11 pm

hi Chris ,

please , Could send me SSIS package which you explained above ….

thanks
mutylala

Loading...

Reply
Chris says:

October 16, 2008 at 10:58 am

Mutylala, can you send me your email address? You\’ll find my contact details at http://www.crossjoin.co.uk

Loading...

Reply
Imran says:

October 24, 2008 at 3:51 pm

Can u please forward me the package? This is urgent. Tks

Imran

Loading...

Reply
George says:

November 10, 2008 at 1:03 am

Hi Chris

Great Stuff! Could you please send the SSIS package to me? I\’ve been looking for this topic for a while and it is so nice to read your article.

Thanks in advance
George

Loading...

Reply
rasheed says:

December 4, 2008 at 11:01 am

Great.. the package work fine for query that return small sets of data. it complain about XML and resultset when the data is averagely large, pls any help..

Loading...

Reply
hung says:

January 13, 2009 at 1:58 am

Hi Chris,Does this mechanism work with AS for Project server 2003 ?If Yes, then would you please send me the package?ThanksHung

Loading...

Reply
Chris says:

January 13, 2009 at 12:51 pm

I assume that Project Server 2003 runs off Analysis Services 2000, right? If so, the general approach should work but the package might need some modification; the package is SQL2005 only.

Loading...

Reply
Jim says:

February 16, 2009 at 3:33 pm

Hi Chris,Could you also email the SSIS package to me please?Thanks!

Loading...

Reply
Chris says:

February 16, 2009 at 4:28 pm

I will if you send me your email address! You can contact me via http://www.crossjoin.co.uk

Loading...

Reply
Unknown says:

April 30, 2009 at 9:45 pm

hi Chris,I tried a different approach for cache warming. In my package I have a execute sql task which gets the mdx queries from a trace table in the database and another execute sql task which is with in a for each loop container executes it.But the problem is the query captured by the profiler is in this format.SELECT NON EMPTY [{D84A78E5-5A54-4A9D-B945-41A1D370B5DE}Pivot30Axis1Set0] DIMENSION PROPERTIES MEMBER_NAME, PARENT_UNIQUE_NAME ON COLUMNSFROM [CUSTOMER CUBE] CELL PROPERTIES VALUE, FORMATTED_VALUE, FORE_COLOR, BACK_COLORwhen i run the package it gives me an error saying can not run this query.Can you please advice why it\’s happening?

Loading...

Reply
Unknown says:

April 30, 2009 at 9:46 pm

Chris, Can you please email me your SSIS package?thanks

Loading...

Reply
Chris says:

May 1, 2009 at 9:59 am

I can if you tell me what your email address is! Alternatively, you can find my contact details at http://www.crossjoin.co.uk

Loading...

Reply
Unknown says:

May 1, 2009 at 3:03 pm

ya, it\’s formijob@yahoo.com

Loading...

Reply
Chris says:

May 6, 2009 at 1:31 pm

BTW, the reason your query is failing is because it\’s referencing a session-scoped named set on the columns axis; I guess you either haven\’t already executed the CREATE SET statement needed for this set, or you haven\’t set RetainSameConnection.

Loading...

Reply
Unknown says:

May 15, 2009 at 11:21 am

Hi,Great blog with interesting informations. I can use it t solve my problem.ThanxM.http://www.vanjobb.hu/

Loading...

Reply
amey says:

December 7, 2009 at 9:15 pm

Chris, I know this blog is much older but is it possible that you have this integration services package with you?

Loading...

Reply
Chris says:

December 8, 2009 at 1:36 pm

Yes I do have it – you can contact me via my website http://www.crossjoin.co.uk

Loading...

Reply
1. Sanjeev Nayak says:
  
  February 11, 2014 at 9:15 pm
  
  Hi Chris, I want to implement cache warming in my current project, for that I have followed an approach wrote some mdx query using create cache command and executing those using ASCMD.exe utility and batch file. its working fine only if I am generating any report the performance is faster as expected but for any other user this cache warming is not effective. is there anything like this .bat file should be executed from service account/administrator privileged so that cache warming will be effective for all users.
  
  kindly suggest.
  
  Regards
  Sanjeev
  
  Loading...
  
  Reply
  1. Chris Webb says:
    
    February 11, 2014 at 9:20 pm
    
    Are all of your users members of different roles? If they are, and your cache warming is warming the formula engine cache, then that would explain why you’re the only user seeing a performance boost. *If* all of this is true then you have to run your cache warmer multiple times for all of your roles because the formula engine cache cannot be shared between users that are members of different roles.
    
    Loading...
  2. Sanjeev Nayak says:
    
    February 11, 2014 at 10:45 pm
    
    No Chris as of now all users are coming under a single role we have different data level access but we are handling all users through a role called business users. I am checking this cache warming effect with users those are having equivalent access with me.
    
    Loading...
  3. Chris Webb says:
    
    February 11, 2014 at 10:52 pm
    
    Then the next thing to check is whether the users’ queries are using features that are preventing them from using the formula engine cache. Do their queries have a WITH clause or do they use subselects?
    
    Loading...
  4. Sanjeev Nayak says:
    
    February 11, 2014 at 11:05 pm
    
    Hi Chris, The mdx which I am using that’s a simple cross join with time hierarchy , I am saving below mdx with a filename.mdx then executing using ascmd.exe.
    
    batch file script:
    ascmd -S leb-edwsrs02 -d “Lebara_DW_NEW” -i Queries\Cachewarm.mdx -o Output.xml -T Trace.csv
    
    MDX Script:
    
    CREATE CACHE FOR GBRAnalytics AS
    (
    {[Measures].[Actual Call Duration]}
    * {[Calendar].[YQMD Hierarchy].Members})
    
    Calling batch file from ssis package execute task, that’s it.
    
    Loading...
  5. Chris Webb says:
    
    February 11, 2014 at 11:24 pm
    
    Is the measure here a calculated measure or a regular measure?
    
    Loading...
  6. Sanjeev Nayak says:
    
    February 11, 2014 at 11:30 pm
    
    Chris, This is a regular measure.
    
    Loading...
  7. Chris Webb says:
    
    February 11, 2014 at 11:36 pm
    
    And to be clear, after you have run the Create Cache statement you run an MDX query it is fast, but when a member of a different role runs exactly the same query it is slow?
    
    Loading...
  8. Sanjeev Nayak says:
    
    February 11, 2014 at 11:39 pm
    
    After executing the create cache command query response is faster for me where as another user who is having role what I am having executing the query response is slow.
    
    Loading...
  9. Chris Webb says:
    
    February 11, 2014 at 11:42 pm
    
    You need to look at exactly what is happening at the query level. It could be that you are warming the cache for the queries you are running, but the queries your users are running are completely different and need different data loaded into the cache.
    
    Loading...
  10. Sanjeev Nayak says:
    
    February 11, 2014 at 11:48 pm
    
    Thanks Chris, I will have a look but I don’t think so, because after executing this create cache command from pivot excel just selecting this measure and time hierarchy into row level which is coming faster and the other user also doing exactly same thing in same sequence but the response is slow for him.
    
    Thanks for your time.
    
    Loading...
  11. Chris Webb says:
    
    February 12, 2014 at 9:57 am
    
    I would also recommend looking in Profiler to see what’s happening – are you seeing the same pattern of Storage Engine activity? Are you seeing Get Data From Cache events being fired in both cases?
    
    Loading...
  12. Sanjeev Nayak says:
    
    February 12, 2014 at 11:04 am
    
    Hi Chris, I found the problem of cache warming even if I am maintaining security through a single Role but we are restricting data to users using a mdx query which is based on a dimension attribute. If I am removing that mdx then cache warming is working fine for all users.
    
    Do you have any other approach of implementing security at data level so that my cache warming will work.
    
    Loading...
  13. Chris Webb says:
    
    February 12, 2014 at 11:08 am
    
    Do you mean that you are using dynamic security, or are all of your users connecting through another application that adds this attribute to all queries?
    
    Loading...
  14. Sanjeev Nayak says:
    
    February 12, 2014 at 10:15 pm
    
    Yes Chris, I am using dynamic security to restrict users which will have access to cube corresponding to some specific channel code and most of the users will have full access to cube. For them I created a new Role without any security, now Cache warming is working fine for them.For restricted users it will experience some slow performance. I think that should be fine.
    
    Thanks for your help.
    
    Regards
    Sanjeev
    
    Loading...
  15. Sanjeev Nayak says:
    
    February 27, 2014 at 3:08 pm
    
    Hi Chris,
    I am facing some problem in KPI report in ssas cube. As per requirement there are some measures identified as KPI measures whose performance will be validated Monthly against some GOAL value which will be set by the business based on certain dimension attributes. KPI report can be sliced by Time dimension and the dimension attribute for which GOAL have been set. For Example one KPI measure is
    
    SalesAmount, whose GOAL has been set for Product subgroup and month combination
    
    ProductSubG > XYZ Month> Jan 2014 > Goal > 100000 and productSubGroup is not directly linked to the sales Fact table from where sales amount is retrieving. Problem is when I am selecting KPI value, goal, trend etc along with time hierarchy its fine..but when slicing with productSubG attribute at individual level its showing proper value but at month level its showing SalesAmount for all product SubGroup.
    
    we have designed like this
    Fact Table > Sales_Fact which is linked to product Dimension through product_sr_key
    KPi Dimension > kpi_sr_key,MesureName(SalesAmount),product SubGroup(XYZ), monthyear(Jan 2014)
    KPI Fact > KPI SR Key (ref key of KPI dim), Goal Value (10000), MonthYear(Jan 2014)
    
    KPI_Fact >> linked >> KPI_DIM on KPI_SR_KEY)
    KPI DIM(MonthYear) >>linked >> dim Calendar(MonthYear)
    KPI DIM(ProductSubGroup)>>linked>> Dim Product (productSubGroup).
    
    Please suggest your thoughts, if this can be resolved..
    
    Regards
    Sanjeev
    
    Loading...
  16. Chris Webb says:
    
    February 27, 2014 at 3:30 pm
    
    I don’t fully understand what you are doing from your description, but it sounds like you might need to set the IgnoreUnrelatedDimensions property: http://www.bidn.com/blogs/adamjorgensen/ssas/467/ignoreunrelateddimensions-in-analysis-services-2008
    
    Loading...
  17. Sanjeev Nayak says:
    
    March 3, 2014 at 8:35 pm
    
    Hi Chris, I am facing a big problem in performance on my SSAS cube.One existing SSAS application currently client using its only having partitions, no aggregation nothing. where as my cube having partitions, aggregations but still its not performing how the old one is. I checked in profiler when I am browsing the old cube its extracting data from partitions (which almost having data volume wise same)still its faster compare to my cube where as my cube extracting data from aggregations some times also from partitions.
    
    Don’t understand whats the problem. Only difference is they are using c# code for partition creation and processing where as I am using XML script. They are using multiple cubes we have created a single cube.
    
    please provide some clue so it will help me to understand futher.
    
    Regards
    Sanjeev
    
    Loading...
  18. Chris Webb says:
    
    March 4, 2014 at 1:19 pm
    
    It’s hard to say what the problem is from your description. The first step to finding out what’s going on will be to look at a Profiler trace and find out whether the difference is in the Formula Engine or the Storage Engine – read the chapters linked to here to do this: http://cwebbbi.wordpress.com/2010/03/23/query-performance-tuning-chapter-from-%E2%80%9Cexpert-cube-development%E2%80%9D-available-online/
    
    Loading...
Andrew says:

April 12, 2010 at 10:14 am

Hi Chris, what if the textData column contains parameters.and would u please send me the ssis package ?Thankschenshx at live dot cn

Loading...

Reply
Unknown says:

June 22, 2010 at 7:09 pm

Hey Can u pls send the package at zurich008@hotmail.com…Thank you

Loading...

Reply
Charles says:

August 16, 2010 at 10:57 pm

can you also send me the ssis package please at: cafarr@godaddy.com…. Thank you Chris

Loading...

Reply
cody says:

October 25, 2010 at 9:35 pm

Would anyone know the reason I am recieving the following error after my cache warmer runs a few hundred queries.

“Server: The operation has been cancelled.”.
Possible failure reasons: Problems with the query,
“ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.

Loading...

Reply
1. Chris Webb says:
  
  October 26, 2010 at 8:10 am
  
  Try running a Profiler trace while the cache warmer is running – do you see any errors from SSAS in there?
  
  Loading...
  
  Reply
  1. cody says:
    
    October 26, 2010 at 7:54 pm
    
    Thanks for responding…
    
    There are no errors in the trace. However when I clear the cache I will only process around 50 queries during the first run. The next time I run it it will make it to 100 and so on as cache is warmed. At some point around 500 queries it will not process further.
    
    It seems as though there is something being timed out.
    
    I can tell you I have deviated from your sol’n a bit by removing the looping through tables. Instead of having a Cache Warm DB I have a Cache Warm table. I am putting all of my queries into one table and running only the query building loop.
    
    Any thoughts?
    
    Loading...
  2. Chris Webb says:
    
    October 27, 2010 at 3:39 pm
    
    Hmmm, sounds strange. What does your memory usage look like on the SSAS box? Is there a memory leak somewhere inside the SSIS package?
    
    Loading...
Arun says:

November 2, 2010 at 7:45 pm

great stuff. Can you please e-mail me the package. i am not able to get it working.

Loading...

Reply
arun says:

November 2, 2010 at 9:38 pm

can you pelase send the package to arunbems@yahoo.com

Loading...

Reply
Dewasish says:

January 18, 2011 at 3:43 pm

Hi Chris, Could you please send me the package, i tried implementing it but couldn’t get it working.
My email id: kr.dewasish@gmail.com

Thanks.

Loading...

Reply
1. Chris Webb says:
  
  January 18, 2011 at 9:11 pm
  
  I’ll try to dig it up, but to be honest I prefer the approach described here instead now:
  http://www.sqlis.com/post/SSAS-Cache-Warming-Using-SSIS.aspx
  
  Loading...
  
  Reply
Anthony Peccerillo says:

May 4, 2011 at 4:08 am

This is an excellent post and works great for queries that return results in a few seconds. I have a query that runs in SQL Management Studio in 1:50 sec {You can thank my Parent Child dimension for that}. When I run it via SSIS I get this error “[Execute SQL Task] Error: Executing the query “WITH MEMBER [Aggregate].[Aggregate].[XL_SD0] AS ‘A…” failed with the following error: “XML for Analysis parser: The XML for Analysis request timed out before it was completed.”. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.” The TimeOut on the Execute SQL Task is set to 0. Is there another Timeout setting that I am missing ?

Loading...

Reply
1. Anthony Peccerillo says:
  
  May 4, 2011 at 5:12 am
  
  I forgot to mention in my earlier post that the TimeOut property of the ADO connection manager for my Analysis Services 2008 DB is set to zero as well
  
  Loading...
  
  Reply
Pingback: Performance tips for SSAS | James Serra's Blog
Pingback: Building multiple caches for PerformancePoint « BI ChopSuey
Ayca says:

May 7, 2012 at 4:24 pm

Hi Chris, Can you please email your SSIS package to hedelericin@gmail.com ?thanks

Loading...

Reply
Mehmet says:

June 22, 2012 at 3:57 pm

Great but My package is failing when work with big data

Loading...

Reply
hxy0135 says:

October 24, 2012 at 8:13 pm

Hi Chris,
Great article. Could you please email your SSIS package to hxy0135@hotmail.com?

Thank you,

Loading...

Reply
Pingback: SSAS : Cube Design and Performance Tuning | Romain Casteres
Ehtesham Ansari says:

October 1, 2014 at 9:21 am

Hi chris, wonderful article. helped me a lot. but what if the textData column contains parameters.and would u please send me the ssis package ? my email id is ehtesham.niet@gmail.com

Loading...

Reply
Pingback: Consolidata – Speeding up SSAS Cube Browsing in SQL Server Management Studio