Over the last few days I’ve been involved in an interesting thread on the SSAS forum regarding what happens when you do a Process Update on a dimension. It’s a topic that is not widely understood, and indeed I’ve not known all the details until today, but it’s nonetheless very important: one of the commonest performance-related problems I see in my consultancy work is partitions that have aggregations designed for them, but where those aggregations aren’t in a processed state because a Process Update has been run on one or more dimensions. Anyway, just now Akshai Mirchandani from the dev team posted a really good overview on that thread of what actually happens when you run a Process Update, so I thought I’d copy it here to ensure it gets a wider audience:
Here is a quick summary of what happens when you do ProcessUpdate:
1. After the dimension has been updated, the server analyzes the changes that occurred to the dimension. In 2005, this analysis was pretty simple and would often incorrectly detect that major changes had occurred that required clearing of indexes and aggregations. In 2008, this code was improved such that it more often would realize that nothing significant has occurred. It’s a fairly small (but useful) optimization — I guess nobody thought it was worth documenting!
2. Based on this analysis, the server will decide whether or not indexes and aggregations need to be cleared. If no (e.g. because records were only added and not deleted/updated), then the partitions won’t be affected.
3. If indexes/aggregations need to be cleared, then the server will check if ProcessAffectedObjects was enabled — if yes, then instead of clearing the indexes/aggregations it will rebuild the indexes/aggregations.
4. The act of clearing the indexes/aggregations also shows up as "partition processing operations" in Profiler — that’s one of the things that has been confusing some of you.
5. When aggregations are cleared, only the flexible aggregations need to be cleared because we’re guaranteed by the rigid relationships that the members cannot have moved and therefore the rollups cannot have changed. However, indexes can still have changed and therefore you may still see the partition processing jobs kick off to clear the indexes.
6. ProcessIndexes and ProcessClearIndexes take care of building both bitmap indexes (aka map) and aggregations — the context is that both aggregations and bitmap indexes are generically considered "indexes".
Really the main takeaway here is that if you ProcessUpdate a dimension, you should strongly consider either doing ProcessAffectedObjects or an explicit ProcessIndexes on the affected partitions so that bitmap indexes and flexible aggregations get rebuilt. The advantage of explicitly doing ProcessIndexes is that you can bring your cube online earlier and have the indexes/aggregations get processed more lazily in the background — a number of customers prefer to do that because their processing windows are too small to wait for the indexes to get processed.
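To make the takeaway concrete, here is what a Process Update with ProcessAffectedObjects looks like as an XMLA batch. This is a sketch only: the database and dimension IDs are placeholders that you’d replace with your own, and ProcessAffectedObjects is set as an attribute on the Batch element.

```
<!-- Sketch: MyDatabase/MyDimension are placeholder IDs -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine"
       ProcessAffectedObjects="true">
  <Process>
    <Object>
      <DatabaseID>MyDatabase</DatabaseID>
      <DimensionID>MyDimension</DimensionID>
    </Object>
    <Type>ProcessUpdate</Type>
  </Process>
</Batch>
```

The alternative mentioned above is to run this without the ProcessAffectedObjects attribute, bring the cube online, and then issue a separate ProcessIndexes command against the affected partitions later.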
Also related to this topic, I thought I’d also highlight a great post by Darren Gosbell where he shows how to find out if your aggregations are processed or not:
http://geekswithblogs.net/darrengosbell/archive/2008/12/02/ssas-are-my-aggregations-processed.aspx
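Darren’s technique is based on the DISCOVER_PARTITION_STAT schema rowset, which returns one row per aggregation in a partition along with its size. A sketch of the kind of DMV query involved is below; all the object names are placeholders, and because this rowset requires restrictions it has to be wrapped in SystemRestrictSchema. Broadly, an aggregation showing a size of zero indicates it is not in a processed state — see Darren’s post for the full details.

```
SELECT *
FROM SystemRestrictSchema($system.discover_partition_stat,
    DATABASE_NAME = 'MyDatabase',
    CUBE_NAME = 'MyCube',
    MEASURE_GROUP_NAME = 'MyMeasureGroup',
    PARTITION_NAME = 'MyPartition')
```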
Hi, I have deployed PPS2007 in a test lab. All went well and without issues. From the Administration Console I configured everything without issues. However, when I went to Excel and tried to use the Report Wizard to create a report, I am asked to select the required model, which I do without issues. Then I click Next, where I am asked to select the dimension and hierarchy, which I select OK. Then when I click the member selection field I get an error as follows: “Error getting the database context for the model site [2035d974-eeea-4143-9231-7d76089fb68f]”. Can someone help please? Regards
Hi Hassan, this isn’t really the right place for this question. Try posting on http://social.technet.microsoft.com/Forums/en-US/ppsmonitoringandanalytics/threads Chris
Hi, Chris. I can’t understand what will happen if ProcessUpdate mode is selected when a fact table contains missing dimension members.
Can you explain what you mean exactly?
Would setting the processing mode to LazyAggregations for measure groups related to a dimension that gets the ProcessUpdate command be helpful?
It would mean that there’s one less thing to have to do explicitly, but in general I don’t like the idea of lazy aggregations – I like to be in control of what’s happening and I don’t like the idea of unpredictable performance.
I can relate, thx
Can you run Process Cube Default to rebuild any of the aggregations or indexes that were dropped as a result of the dimension update process?
Yes, you can.
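As a sketch, a Process Default on a cube can be issued with an XMLA batch like the one below (the IDs are placeholders). Process Default brings an object to a fully processed state, so it rebuilds only the indexes and aggregations that are currently missing rather than reprocessing the data itself.

```
<!-- Sketch: MyDatabase/MyCube are placeholder IDs -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Process>
    <Object>
      <DatabaseID>MyDatabase</DatabaseID>
      <CubeID>MyCube</CubeID>
    </Object>
    <Type>ProcessDefault</Type>
  </Process>
</Batch>
```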
Hi Chris,
First of all, thank you for all your very helpful articles! Yours, as well as Marco Russo’s and many others’, have been life savers many, many times for me.
I have a question regarding specifically the subject of this post.
I have a cube with 3 measure groups, all partitioned by a “snapshot” dimension.
My fact data is versioned and so has EFFECTIVE_DATE and EXPIRY_DATE columns.
Snapshots are set dates of effectivity for my fact data. This way the user can compare what was effective today (“CurrentSnapshot”) versus a week ago (“CurrentSnapshot-7”).
The 3 measure groups are Facts for “Reservation Sales” (really an aggregated view of detailed sales), “Reservation Item Sales” (Sales for individual Reservation Items) and “Reservation Detailed Sales” (Sales for each reservation – no Reservation Item Breakdown – by type of sales: Airport Taxes, Commission, etc)
Partitions on snapshots are made so that I have x amount of “Moving Snapshots”: CurrentSnapshot (today), CurrentSnapshot-1 (yesterday), CurrentSnapshot-7(last week) and so on.
I also have “fixed” snapshots, which contain fact data that is effective on the 1st of each month. My questions concern what is happening to these partitions.
The data in these fixed partitions never changes and so we do not process them in our SSIS script (based on one of your posts on processing through a Script Task and AMO 🙂 ).
However, even if we do not process these partitions, they still get “processed” in some form or another. I’m guessing they have their indexes dropped, since the time taken for each fixed partition is small (between 5 and 15 seconds). The thing is that we now have 22 fixed partitions x 3 measure groups, making 66 partitions. If each of these takes 10 seconds, we easily have over 10 minutes of processing on these partitions.
Key members for each dimension related to these measure groups do not change and do not get deleted. However, it is very possible that another attribute in the dimension gets changed (All dimensions are SCD1).
And my question 🙂
Since the key member attributes do not change and do not get deleted, aren’t the fixed partitions supposed to stay exactly the same when we “ProcessUpdate” the cube’s dimensions? Or are the indexes being dropped (assuming that this is what’s happening)? Or is it normal that ALL partitions get touched when a Process Update occurs on a dimension?
Thanks in advance for taking the time to reply! Let me know if you need more details.
I forgot to mention that the fixed partitions do not have aggregations defined.
If there are no aggregations, then it must be the indexes of the attributes where data has changed that are being rebuilt. Some things to try to improve processing performance would be to set the RelationshipType property of your attribute relationships to Rigid where appropriate, setting the AttributeHierarchyOptimizedState to NotOptimized on attributes that won’t be used for querying often (or even better, setting AttributeHierarchyEnabled to false or even deleting the attributes), and everything Andrew Calvett recommends here:
http://blog.calvett.co.uk/2013/02/09/the-anatomy-of-a-process-update-part-1/
http://blog.calvett.co.uk/2013/02/21/the-anatomy-of-a-process-update-part-2/
*Curiously, I could not reply to your reply so I am doing it here*
Thanks! That helps a lot!
We have dimensions that come from data in MDS, and I know that one user is changing some attributes in a few entities; eventually those changes are brought back to our dimensions, so this makes a lot of sense to me now. I was under the impression that indexes were only for leaf-level attributes and that any indexes for rollups were managed from inside the dimensions. I never really looked at all the internal files for SSAS. I see that I have a lot of .map files (81,000), so as I understand it now, I have a great potential of having to recalculate the indexes, especially since the dimensions are SCD1.
I have also come upon the “Delete Unused Indexes” feature from BIDS Helper that I never gave much attention to. I am running an SSAS query trace for a week, and then I will run it to see which attributes that wizard suggests I set to NotOptimized.
Thanks again for replying so quickly!
Hi Chris, do you recommend doing a Process Index on a dimension after a dimension update?
No, there’s no need to do a Process Index on a dimension after a Process Update.
Hi Chris,
For processing dimensions and Fact data we have steps in Sql Server Agent Job.
1) First Dimensions are processed with “Process Update”
2) The next step is processing only one particular partition with the “Process Full” option. Though we have more than 5 partitions, only one is processed.
So my doubt here is whether the indexes will be recreated when the partition is processed with the “Process Full” option.
Currently we are facing Performance issues when cube is queried by users.
So could you please suggest whether the processing is fine or do we need to change the processing here?
Many Thanks in advance
Hi Thiyag, indexes are created when you do a Process Full. However, doing a Process Update on a dimension may also delete indexes and aggregations on other partitions, not just the one you are processing. You need to add a third step here and do a Process Default on the entire cube to make sure all indexes and aggregations are built.
Hi Chris,
So it should be like: Process Update on the dimensions,
Process Full on the partition of the measure group,
Process Default on the entire cube?
I recommend a Process Default on the entire database, not just the cube.
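Putting that whole sequence together, a single XMLA batch along these lines would cover all three steps. This is only a sketch: every ID below is a placeholder, and you would repeat the first Process command for each dimension you update.

```
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <!-- Step 1: Process Update the dimension(s) -->
  <Process>
    <Object>
      <DatabaseID>MyDatabase</DatabaseID>
      <DimensionID>MyDimension</DimensionID>
    </Object>
    <Type>ProcessUpdate</Type>
  </Process>
  <!-- Step 2: Process Full on the one partition that receives new data -->
  <Process>
    <Object>
      <DatabaseID>MyDatabase</DatabaseID>
      <CubeID>MyCube</CubeID>
      <MeasureGroupID>MyMeasureGroup</MeasureGroupID>
      <PartitionID>CurrentPartition</PartitionID>
    </Object>
    <Type>ProcessFull</Type>
  </Process>
  <!-- Step 3: Process Default on the whole database to rebuild any
       indexes/aggregations dropped by step 1 -->
  <Process>
    <Object>
      <DatabaseID>MyDatabase</DatabaseID>
    </Object>
    <Type>ProcessDefault</Type>
  </Process>
</Batch>
```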