PDW and SSAS

One new feature of SQL Server PDW 2012 that hasn’t had the attention it deserves is the fact that it is now officially supported as a data source for Analysis Services, both Multidimensional (in ROLAP and MOLAP modes) and Tabular (in In-Memory and DirectQuery modes). If you are working with extremely large data volumes in SSAS then PDW might be something you want to do some research on. For SSAS Multidimensional in MOLAP mode or Tabular models in In-Memory mode, using PDW as a data source should make processing run extremely quickly. For SSAS Multidimensional in ROLAP mode or Tabular models in DirectQuery mode, it can give you interactive query access to data volumes that MOLAP/In-Memory simply couldn’t handle (remember though that DirectQuery only works with DAX queries, so Excel PivotTables don’t work with it, only Power View).

There are a few public sources of information on PDW/SSAS integration. One is the white paper on PDW that you can download from Henk van der Valk’s blog here:
http://henkvandervalk.com/introduction-to-sql-server-2012-parallel-data-warehouse

Here’s the relevant section:

New in SQL Server 2012 PDW, you can use PDW as a high performance relational data source for building multidimensional OR tabular models with SQL Server Analysis Services (SSAS). For example, you can:

· Use DirectQuery to perform real-time queries against a SQL Server PDW data source from a tabular model.

· Reduce ROLAP query times by using clustered columnstore indexes on SQL Server PDW tables.

· Use new features of SSAS. For example, use EnableRolapDistinctCountOnDataSource to run distinct count operations on PDW tables.

· Use PDW as a data warehouse for aggregating Hadoop data for use in multidimensional or tabular models.

The DirectQuery page in Books Online says similar things:
http://msdn.microsoft.com/en-us/library/hh230898.aspx

Some quotes:

In contrast, a tabular model in DirectQuery mode uses data that is stored in a SQL Server database, or in a SQL Server PDW data warehouse. At design time, you import all or a small sample of the data into the cache and build your model as usual. When you are ready to deploy the model, you change the operating mode to DirectQuery. After you change the operating mode, any queries against the model will use the specified relational data source (either SQL Server or SQL Server PDW), not the cached data.

DirectQuery can take advantage of provider-side query acceleration, such as that provided by xVelocity memory optimized column indexes. xVelocity columnstore indexes are provided in both SQL Server 2012 and SQL Server PDW, to support improved DirectQuery performance.

It’s a shame there isn’t more information out there though. This post from Michael Mukovskiy has some interesting findings on using PDW v1 as a data source for SSAS:
http://blog.oraylis.de/2013/07/pdw-cube-processing-experience/
…presumably PDW 2012 would give even better results.

12 thoughts on “PDW and SSAS

    • I have been to the PDW v2 course and the problem is that this is a product that has NDA written all over it. It is impossible to discuss shortcomings and benefits because it is not a standard product. This is not unique to MS though!

  1. Reading Michael’s notes from his experience I presume that PDW v2 may give you much better results comparing to v1. Page compression and Columnstore indexes dramatically improved performance of whole appliance.

    One more to add: the improvement in ROLAP partition processing is in the first part when data is retrieved from your source (PDW in this case).

  2. For more information about PDW, have you seen the website http://www.upgradetopdw.com? There are two papers there.

    The first one gives an overview of PDW and describes the benefits of V2:
    http://download.microsoft.com/download/5/0/1/5015A62E-06BF-4DEC-B90A-37D52E279DE5/SQL_Server_2012_Parallel_Data_Warehouse_Breakthrough_Platform_White_Paper.pdf.

    The second one describes the process of migrating to PDW:
    http://download.microsoft.com/download/4/2/6/42616D71-3488-46E2-89F0-E516C10F6576/SQL_Server_to_SQL_Server_PDW_Migration_Guide.pdf

  3. ROLAP seems to be good idea for scaling SSAS. However, we see not correct behaviour in few cases,which causes performance issues. For example SSAS ‘likes’ to SUM all measures in the fact table even if you need only one and etc. Are there any tips and trics on how to implement ROLAP via MS SSAS?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s