If I Could Have New Features In SSAS Multidimensional, What Would They Be?

Indulge me for a moment, please. Let’s imagine that somewhere in Microsoft, someone is planning for SQL Server v.next and is considering investing in new features for SSAS Multidimensional (don’t laugh – I wouldn’t be writing this post if I didn’t think it was a possibility). What features should they be?

Before I answer that question, it’s worth pointing out that despite what you might think there has been some investment in SSAS Multidimensional over the last few years. This post lists what was new in SSAS 2012 Multidimensional; since then support for DAX queries has been added and, umm, the new Divide() function. This must have been a lot of work for someone – but why does it get overlooked? One reason: none of these changes have made much difference to the ordinary SSAS Multidimensional developer’s life. DAX query support is great if you’re one of the few people that uses the SharePoint version of Power View; shockingly, it still doesn’t work in Excel 2013 Power View yet (though I guess it will be the way the new Power BI connects to on-prem Multidimensional). NUMA support is great if you work for an investment bank and have vast amounts of data and a high-spec server, but that’s only about 0.1% of the installed base.

So from this we can learn that the main consideration when choosing new features to implement should be that they should be relevant to the majority of SSAS Multidimensional developers, otherwise they’ll be ignored and MS may as well have not bothered doing anything. To that we can add these other considerations:

  • These features should provide compelling reasons to upgrade from earlier versions of SSAS to the new version
  • While some features should be available in all editions, there should also be some features that encourage customers to upgrade from Standard Edition to Enterprise Edition
  • There are a limited resources (time and developers) available and Power Pivot/SSAS Tabular will be the priority, so only a few features can be delivered.
  • Features that are only there to support Power BI don’t count

With all of that borne in mind, here’s what I would choose to implement based on what I see as a consultant and from the popularity of particular topics on my blog.

Last-Ever Non Empty

One of the most popular posts I’ve ever written – by a gigantic margin – is this one on the last-ever non-empty problem. Given that so many people seem to come up against this, and that the MDX solution is complex and still doesn’t perform brilliantly, I think it should be built into the engine as a new semi-additive aggregation type. Since semi-additive measures are Enterprise Edition only, this would be my sole Enterprise Edition feature.

MDX Calculation Parallelism

Ever since I’ve been working with SSAS, people have always asked why the Formula Engine has been single-threaded. I understand why the SSAS dev team have ignored this question and instead concentrated on tuning specific scenarios: doing parallelism properly would be extremely difficult given the way MDX calculations can be layered over each other, and in plenty of cases it could lead to worse performance, not better. However I’m not asking for a ‘proper’ implementation of parallelism. I just want something dumb: a boolean property that you can set on a calculation that tells the Formula Engine to do this calculation on a separate thread. If it makes performance better then great; if not, then don’t set it. My guess is that even a crude implementation like this could make a gigantic difference to performance on many calculation-heavy cubes.

Drillthrough

Drillthrough is one of those features that almost everyone wants to use, but for some reason has been left in a semi-broken state ever since 2005. Here’s what needs to change:

  • It should work with calculated members. I don’t expect SSAS to understand magically how to work out which rows to display for any given MDX calculation, but I would like a way of specifying in MDX what those rows should be.
  • Those stupid, ugly column names – SSDT should let us specify readable column names and let us have complete control over the order they appear in.
  • Excel should allow drillthrough on multiselect filters.

‘Between’ Relationships

This might seem a bit of a strange choice, and I suspect it may not be easy to implement, but another problem that I come across a lot in my consultancy is the ‘events-in-progress’ problem. I’ve blogged about solving it in MDX and DAX, as have many others. I would love to see a new ‘between’ dimension/measure group relationship type to solve this. In fact, competing OLAP vendor iccube already implemented this and you can see how it works on that platform here and here. My feeling is that this would open up a massive number of modelling opportunities, almost as many as many-to-many relationships.

 

And that’s it, four features that I think could make SSAS Multidimensional v.next a must-have upgrade. I’m not so naive to believe that any or all of these will be implemented, or even that we’ll get any new features at all, but who knows? If you have any other suggestions, please leave a comment.

46 thoughts on “If I Could Have New Features In SSAS Multidimensional, What Would They Be?

  1. +1 for allowing drillthrough on calculated members where we can specify what the drillthrough should return. It would also be nice to allow multiple sets to be returned on a drillthrough, for example on a ratio that sources data from different fact tables.

    Between Relationships – Yes… may that day come to pass!! It is extremely beneficial to the events in progress problem as you’ve mentioned – a problem which arises very often in the real world.

    Calculation Parallelism – +1 million.. it is very archaic for a calculation engine these days to be single threaded!!

  2. My main concern as BI Developer is MDX performance, I’m often forced to implement things in SQL because I know it will be slower if I implement it in MDX.

    Another pain point IS that Microsoft completely dropped the ball when it come to client tools. Most of their new offerings only work with tabular models which no one uses.

    Other useful features from a developer perspective would be

    1. Ability to group calculations into “folders” so that you’re not scrolling up down through hundreds of calculated measures.

    2. Treat named sets as first class citizens. Currently we can’t use a named set as a default member in an attribute and we can’t use them as a filter in Excel

    Hope that someone at Microsoft is reading this

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Raf, you are able to group calculations into folders using the Display_Folder property of a calculated member: https://msdn.microsoft.com/en-us/library/ms144787.aspx This has been possible since 2008 R2 at least, and even before that there was a button on the toolbar in the Calculations tab that did the same thing.

  3. Couldn’t disagree with anything on your list, but the single biggest feature for many of my customers would be a more optimised cache. SSAS’ caching could be greatly improved to hold more of the cube in memory. Currently, it relies too much on the Windows file cache. There is an active Connect suggestion on https://connect.microsoft.com/SQLServer/feedback/details/760107/multidimensional-cube-is-not-retained-in-memory-even-though-there-is-plenty-of-ram.

  4. a more granular visual total control! like having the ALL member with the real totals, while the rest of the dimension will be in visual total mode.

    the between for sure!!! I have a lot of cases with start and end dates, and the users needs how many persons or events for a specific or a range of period. (generally a distinct count measure); I’m doing this with a huge (1 billion rows) many-many fact table, not beautiful but the speed is correct.

  5. Speaking of drill to detail actions. how about having the ability to “re order” the attributes(columns) in the action? If you have a Date dimension and want to have the first column Day, the second Week and the third month, you cannot rearrange their order. In fact, the order that shows up is the “order in which you originally added the attribute to the dimension. Cannot believe that this has not been added over the years.

  6. Great list so far and my favorites here are parallel mdx calculations and drillthrough on calculated members.

    Also, I’d like to add:
    – option to compute HOLAP aggregates on the database (in SQL) instead of pulling all detail data to SSAS and do to aggregation there.
    – optimized ROLAP query generation (especially for APS).
    – translations: allow “order by name” for translated data (currently translations are always ordered by the original language if you choose ‘order by name’)
    – handling of distinct counts is not straight forward (separate measure group). Improvements would be very welcome.
    – option for easier many-to-many relations (maybe like in Power BI Designer Preview)

  7. gbrueckl – Austria – Gerhard has been implementing traditional BI and data warehouse solutions based on the Microsoft Data Platform for over 10 years and has been sharing his knowledge with the community ever since. With the rise of “Big Data” over the last years he also shifted his interests towards new technologies, mainly related to the Hadoop eco system to fight the masses of data he frequently encounters at his customers. Having this strong background with the Microsoft Data Platform, he always tries to combine this “old” and the “new” world which still have a lot of things in common!
    gbrueckl says:

    SSAS Multidimensional scales pretty well with fact-data but it is a real pain if you work with big dimensions (lets say 1M+ members and/or 100+ attributes)

    the ProcessingOption for Dimension “ProcessByTable” throws errors even though “ProcessByAttribute” works just fine, even for small dimensions where available memory is not the problem. If you have enough memory and use “ProcessByTable” it should just work the same way as “ProcessByAttribute” and not throw any errors
    This would avoid the costly SELECT DISTINCT queries on the relational engine for each attribute

    an automated process that generates fact-partitions based on existing dimension-data
    lets say “Partition by Month” and SSAS checks the Month-Attribute and automatically creates a new partition for each month

    proper security on calculated members

    a multi-threaded FE would probably have the biggest impact on existing mutlidimensional cubes
    it would be enough if each calculation (e.g. each calculated member) would run in a separate thread so if you query contains 3 of them it would be spread across 3 threads

    better support for DAX
    well, DAXMD is fine and basically works pretty well. However it does not work with any of the new Power BI features being PowerView in Excel 2013/365, SSAS on-prem Connector for Power BI, Power Query (as the result is a table it would be better to use DAX instead of MDX to query a cube)

    im sure i would find more but thats just what came to my mind right now 🙂

  8. 1. Support for claims based authentication using standard SAML tokens, no C2WTS.
    2. +1 for parallel FE
    3. Improve the performance of distinct counts. Blazingly fast in Tabular , but unusable for many critical apps with UDM.
    4. Improve performance of many-to-many joins.
    Thanks for this and many other great posts!

  9. About a multi-threaded FE, a small company called Iccube implemented it on their olap server, a lot of customer would love it , why is it so difficult for MS to do it ?

    in-memory molap and better management for huge dimensions…

    built-in predictive functions

    and then why not dream with spatial MDX ; give me my top 100 customers at 10 min drivetime from my POS…okay it is possible now but it is the headache !

  10. 1) Display order property for measures to sort measures other than alphabetically.
    2) In SSMS cube browser, use formatted values for measures, like it used to – why was this dropped in 2012?

  11. Good wish-list.

    Let’s start by saying … even if AS Multidimensional is 15+ years old, the technology still rocks and is still the real deal. There are hundreds of enterprise reporting tools and millions of multidimensional cubes out there that use SSAS Multi.
    My feature wish-list would be:
    – ROLAP queries improvements please, the future is probably in aggregated clustered columnstore indexes.
    – caching improvements as requested above
    – A new in-memory storage mode, leveraging vertipaq
    – LinkMember performance and usability improvements, or in other words relationships overrides. IT should be much easier than it currently is. Put on-the-fly many to many relationships in that same bucket.
    – dimension = measure group so we can also partition dimensions (similar to Tabular)

  12. my attention got caught up on this: “There are a limited resources (time and developers) available and Power Pivot/SSAS Tabular will be the priority, so only a few features can be delivered.” ….

    that passes a pretty much strong message… & why I would prefer to *gradually* gravitate to a single analytics engine bridging the best of both. Focus, faster development and deliveries &relevant breakthroughs.

    After working with both models (much more on MD still, years invested in it…. blood sweat and tears…tears being of joy! :)) I have to confess the tabular model seems to be more natural for users, no “metrics”,”dimensions”, you can count/min/max/avg/distinct on the fly. You can count/distinct count customers, you don’t need a “customer’s measure group”, you can relate tables freely (you don’t need them to be measure groups/dimensions)

    Having said that…! if I had to choose my top 10 features for new md release (or for the moment, what I can think of…):

    1-*on time* support for latest frontends, keep up with tabular, we can’t wait months/years every time new frontend feature become available (excel powerview….new powerbi…)

    2-on the fly aggregations like tabular, only model the default aggregation, possible? engine has not built to this originally …including on the fly calcs over dimension attributes (there I go to the tabular side of it…)

    3-+99999 to the FE parallel fix, stumbled on this a few times…to the point of having to hack workarounds, really hard to accept these days

    4-drilltrough, agree with Chris notes

    5-dimension relations/auto filters – common for multicompany dimensions, very frequently, dimensions are related

    6-native full text search indexes, security trimmed, usable in Excel/powerview/everywhere (QA already partially uses this)…. am I “cannibalizing” some well-known product features? 🙂 Yes, we got plenty of other cool ones to build! 🙂

    7-native olap writeback partition, where we can write tuples: coordinate, value, formatted value, etc (enable flexible forecast & targets for any metric, work with excel writeback natively, budget/what if on pivots, even overwrite regular measures, any cell, persist on a generic fact table tuple | value | formatted value | date | user |etc… )

    8-virtual cubes – several cubes instead of one is a known practice (not my personal “default” I must say…) a kind of more decoupled “virtual cubes” would be great, linked measure groups have too many issues, (sql 2000 had similar concept of virtual cubes )

    9-improved rolap for columnstore indexes

    10-native in memory storage/mode, picked from Richard Lees notes

    (noting there’s a lot of diversity in feature requests, won’t be an easy task to prioritize really… effort for each one will be key, but the post should gets things going 🙂 thanks Chris!)

  13. Server Side:
    1) Ability to control which dimensions are used in joining with a Many to Many relationship. Example, adding a new dimension to that will act as a many to many and setting another dimension in the group to be ignored when applying the query.

    Client Side (Excel)
    1) Attribute Search
    2) Display of the Description Column for internal Field Help.
    3) Collapse of all Measures and Attributes by default.

  14. I do not think that “Parallelism in MDX calculations” is realistic. That would require a lot of effort for a feature that might work. I agree with the rest of your suggestions Chris. One of mine is to optimize the RANK() function and add Percentiles functions with good performance.

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Yes, that’s a good point – Rank() and Order() both desperately need optimising, and we do need a Percentiles function.

  15. All good ideas Chris & I’ve also come across scenarios where those features would have been handy…
    I couldn’t find them on MS-Connect, otherwise I’d have voted for them (hint, hint 🙂

  16. This would be my list:
    -Aggregations for intermediate levels in parent-child dimensions
    -Partitioned dimensions
    -FE parallelism
    -Query execution plans

    The drillthrough section doesn’t cause that much pain actually. First two points you bring up can be solved by ASSP. The third one (multiselect) is a pain, but at least with Excel 2010 the problem was that Excel didn’t even ask for custom actions when multiselect was active. So more like a client, not SSAS problem 😉

  17. Analysis Server multidimensional is still used a lot, and it should still be improved.
    These are my suggestions.

    Analysis Server improvements:
    1) Parallelism in formula engine mdx queries. This would greatly speed up many customer scenarios. Chris suggestion would be great! We need to squeeze every little bit of performance, as the data amount in AS is always increasing…
    2) Allow third party application to write many values into the olap cache in a very fast way.
    3) Allow the clearcache xmla statement to work on single olap partitions, instead of measuregroups
    4) UPDATECUBE improvements:
    4a) it should work even when there is a CREATE SUBCUBE statement active on the cube, in such a way that the update statement only updates the tuples in the subcube scope.
    4b) it should work with many to many dimensions
    4c) it should be able to write null values. There is no need to commit them, as a third party application could very easily delete the required record from the SQL relational database. Only the AS cache should be updated…
    5) Implement an UPDATE mdx statement even for standard dimensions (not parent child): The UPDATE on dimension should have the following features
    5a) insert/update tuples in the AS cache of the dimension
    5b) not require a dimension process
    5c) enforce the BEGIN/ROLLBACK transaction
    5d) pushing the inserted/updated data to the relational database is not required, as a third party application could easily implement it
    6) Implement a way to remove inline mdx scope assignments. Let’s say that on a adomd connection a third party application applies an inline mdx scope statement to a tuple. There is no mdx statement to remove it, unless a Subcube existed before the inline statement was applied and the subcube is dropped.
    7) Make mdx subqueries faster. Their performance has improved in recent AS releases, but in many situations they are still quite slow
    8) Make recursive calculations faster. There are many very complex formulas that cause a very big performance hit. A typical problem is something like this: tell me the number of weeks of stock coverage with the stock levels, planned sales, etc
    9) The partition slice should be used even if partitioning is at the measure level. Let’s say that an olap measure group contains two measures. These measure, for external constraints, have to be put in two different olap partitions, so that each partition contains all the data for each measure. It should be possible to set a measure partition slice on each partition, so that the mdx query executor, when querying a only one of the measures, ignores the other olap partition
    10) When deploying via AMO, check if the a user associated to a role really exists in Active directory. It is not nice when a commit fails with an internal error because a user has been removed from Active Directory…
    11) Semiadditive aggregations on non time dimension. This could be generalized as the ability to choose a specific aggregation rule for each dimension used by a measure

    Client side (Excel) optimizations:
    1) Make Excel slicers faster. In many situations they execute thousands (!!) of mdx queries on data and metadata. Very nice feature, but mostly useless in many customer scenarios…
    2) Allow a third party application to set/change the Excel slicers connection, or make so that they share the excel pivot connection
    3) Allow a third party application to override the language of the excel slicer connection
    4) Make Excel slicers work when there is an active CREATE SUBCUBE statement

  18. Thought of a new one this morning as I had the bright idea to start hacking away at a cube calculation script before the first cup of coffee…

    Better scripting environment for cube calculation script:
    – intellisense (doesn’t need to be as nice as .NET in visual studio but at least as good as SSMS
    – more informative error messages

    …now for that first cup of coffee :-p

  19. First let me say that I am very happy for Microsoft’s renewed interest in OLAP. I was beginning to think they lost all the developers on the team that built their multidimensional product.

    There is a performance issue that has been at the top of my list (for like 10 years now). Simply drilling down into an unnatural (“user”) hierarchy is MUCH slower than creating a cross-join query for the EXACT SAME attributes. Cardinalities just need to be above 100 or so. How silly is that…

    https://connect.microsoft.com/SQLServer/feedback/details/349472/unnatural-hierarchies-perform-worse-than-expected
    http://support.microsoft.com/kb/2131988

  20. One thing that is well overdue is improving the developer experience in Visual Studio. There are a lot of obvious things that are currently missing to improve developer productivity and reduce errors. Much of my time in Visual Studio is spent replaying mental checklists in my head to avoid known bugs that either happen in the IDE or are produced unintentionally by the IDE if you aren’t being careful. (eg. bugs in DSV designers, maintenance of calculated measures that are common across several cubes, maintenance of linked measure groups, etc.).

    During the time Microsoft has been sleeping (for the past five years) a whole separate tool has been created independently to fill in a lot of gaps in the areas that improve development experiences. Its called the “BIDS helper”. The biggest problem with the “BIDS helper” is that it is unsupported and has to be installed separately.

    https://visualstudiogallery.msdn.microsoft.com/13e3ec4a-b0b8-4407-9f01-4deff161f381

    In order to bring the developer experience back inline with expectations, Microsoft could just fire up a copy of BIDS helper and see what they were missing…

  21. Chris,

    I have been in healthcare for 20 years and will be entering investment finance in the future. Your statement: “NUMA support is great if you work for an investment bank and have vast amounts of data and a high-spec server, but that’s only about 0.1% of the installed base.”

    …could you expound on this? Is this statement made in the context of sheer data volume or to the heavy calculations seen in the finance arena. Along those lines, do you ever see clients using Server ADOMD.NET to wring out any extra performance gains?

    Excellent job as usual.

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Hi Steve,

      When I made that statement about NUMA I was thinking more about data volumes rather than calculation complexity. What do you mean by using ADOMD.NET to get performance benefits?

      Chris

      1. Nothing major……In general it is my understanding that using .NET CLR sprocs and UDFs outperforms SQL functions when it comes to string parsing and mathematical computations in an RDBMS. I was wondering if that axiom also applied to the multidimensional side of things as well using SERVER ADOMD.NET stored procedure as opposed to an in-the-box MDX function……for a complex mathematical computation within a multidimensional cube.

      2. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        As always, it depends. For a few scenarios using .NET code can give you performance improvements, but in most cases .NET custom MDX functions perform really badly and should be avoided.

  22. I haven’t read all the suggestions, sorry if I repeat something …
    1. Solve the “arbitrary shape” problem; since Excel is very, very, very bad in dealing with that and it’s quite impossible to convince users to use anything else, please solve it on the SSAS side
    2.Everything scriptable and doable in VS: it’s a nightmare to be able to do some things in SSMS and others in VS, and it’s a nightmare for a DBA to handle the life cycle of a SSAS project
    3. Better performances of the Many-to-Many joins
    4. The handling of the aggregations is quite reasonable with BIDS, however there are improvement areas (when you partition a measure group by some (level of a) dimension member, BIDS should design the aggregations accordingly to that
    For what regards a parallel FE Cristian told me in the early 2006 why, being a CPU bound process, it’s not useful to have that.

  23. 5. In SSMS: some evidence on if a query is using “block computation” or not, and possibly some hint to have it; it could be understood in MDX Studio, bur since Mosha left nobody maintained the tool
    6. In VS: some graphical tool to handle the calculated members dependencies (it was on MDX Studio)
    7. Perspectives as a security mechanism

  24. A must feature is to allow for multiple parent/child hierarchies on a dimension. Don’t you find this to be a major draw back?

    SAP HANA as a source will also be great.

Leave a Reply to thomasivarssonmalmoCancel reply