Storage and the NameColumn and KeyColumns Properties

Those of you who have worked with SSAS Multidimensional for a reasonable amount time will, no doubt, be very familiar with the NameColumn and KeyColumns properties of an attribute (if you’re not, see here and here) and how they should be used. You will probably also know that when the KeyColumns property has been set to only one column, then the NameColumn property can be left unset and the key will be used as the name of the attribute’s members.

However, while onsite with a customer recently I noticed something strange. Here’s an illustration: if you create a simple dimension based on the DimCustomer table in Adventure Works and create a single attribute based on the CustomerKey column, this is what you see in the Properties for that attribute:

image

However, if you deploy the database then import it into a new project in SSDT/BIDS, then you see that the NameColumn property has been set:

My first thought was that this was a bit dangerous, because it might mean that the imported version of the database would start storing extra strings for the names. But this was incorrect because a look at the data directories for the two versions of the dimensions showed they contained the same files and were using the same amount of storage:

I’m very grateful to Akshai Mirchandani of the dev team for confirming that in this situation, it is irrelevant whether you set the NameColumn or not – data duplication will always happen, and the key values will be stored again as strings. The only time it doesn’t happen is when the key and the name of the attribute are both the same column and that column is a string.

This means that if you have a very large attribute that is in danger of exceeding the infamous 4GB limit (although this is of course fixed in SSAS 2012) and which never needs to be visible, you can use the trick that Greg Galloway describes here to reduce the size of the string store. This involves creating a dummy column in your DSV (or underlying view or table) that contains only an empty string and then setting this as the NameColumn of your attribute. For the example above, this is the result:

For this example, the overall amount of storage used for the dimension has gone down from 1.24MB to 1.04MB, and although you can see the .ahstore file (the hash store) for the Customer Key attribute have grown, the size of the string store, Customer Key.asstore (note: don’t get confused between .asstore and .astore files), has reduced from 362KB to 1KB.

5 thoughts on “Storage and the NameColumn and KeyColumns Properties

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      That’s a good question. The main reason I wrote this up is that this is an important way of avoiding the 4GB limit, but I need to do some more research into whether there are other processing or querying performance benefits – I would imagine there are.

  1. Dear Chris,

    Although this blog post was published more than 10 years ago, it is still highly relevant to me.

    In your example, will the problem (data storage) be solved if I change the NameColumn.DataType property from WChar to Integer help?

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      I don’t know I’m afraid, and I’m not even sure if anyone inside Microsoft will know either without spending a lot of time searching through the codebase. Your best bet is to test this yourself .

Leave a Reply to Chris WebbCancel reply