Can I run SQL against an Analysis Services cube?

I take what is probably an unhealthy interest in the statistics that MSN Spaces generates about visits to this blog, and in particular the searches that people run on Google which land them here. Over the last few months I’ve noticed that the someone has been searching on the phrase "Can I run SQL against an AS cube?" on a fairly regular basis; since I mentioned the fact that you can in a post a while ago, I assume that’s why they always come here, but I thought it would be good to answer the question in more depth. Never let it be said that I’m not responsive to the needs of my audience – although I’ll draw the line at pandering to the many people who seem to be looking for pictures of "hot bi action"…
 
In fact, you’ve always been able to run SQL against Analysis Services right back to the days of OLAP Services. Traditionally this was only useful when you wanted to create local cubes from server cubes and the subset of SQL supported by AS was so limited you wouldn’t want to use it for anything else; the fact that this stuff isn’t documented anywhere, possibly intentionally, didn’t help. However, when I started to do some research into AS2005 SQL I was pleasantly surprised at how much I could do. My starting point was to look at some of the SQL queries that are generated when AS2005 creates local cubes. Take the following MDX statement which uses the CREATE GLOBAL CUBE syntax to create a local cube from Adventure Works:
 
CREATE GLOBAL CUBE [Test]
STORAGE ‘c:\MyCube.cub’
FROM [Adventure Works]
(MEASURE [Adventure Works].[Internet Sales Amount],DIMENSION [Adventure Works].[Geography],DIMENSION [Adventure Works].[Product])
 
If you run a profiler trace while this executes, you’ll see a whole bunch of SQL statements are generated in the background to get the data to populate the local cube. Here are two representative examples:
 
SELECT
  DISTINCT
  KEY ( [Adventure Works].[$Product].[End Date],0 )
  AS [oduct0_0], NAME ( [Adventure Works].[$Product].[End Date] )
  AS [oduct0_1], MemberValue ( [Adventure Works].[$Product].[End Date] )
  AS [oduct0_2]
   FROM [Adventure Works].[$Product]
 
SELECT  AGGREGATE ( [Adventure Works].[Internet Sales].[Internet Sales Amount] )
  AS [Sales0_0], KEY ( [Adventure Works].[$Product].[Product],0 )
  AS [oduct1_0]
   FROM [Adventure Works].[Internet Sales]
    NATURAL JOIN
   [Adventure Works].[$Product]
  GROUP BY [oduct1_0]
 
The first thing to notice is that the ‘tables’ we’re querying are either dimensions or measure groups. A dimension ‘table’ has the naming convention [CubeName].[$DimensionName] (note the dollar sign) and a measure group ‘table’ has the naming convention [CubeName].[MeasureGroupName]. We can obviously do joins between them using the NATURAL JOIN syntax; we can also do GROUP BYs and use functions like AGGREGATE (which I assume aggregates the measure value by its cube aggregation function), KEY, NAME and MEMBERVALUE (which as far as I can see allow you to retrieve the key, name and membervalue properties associated with a dimension attribute). My memory might not be entirely accurate on this but I’m fairly sure that none of the above could be done with AS2000 SQL. You can also do WHERE clause filtering too, but it looks like you can only AND conditions and not OR them, so

SELECT AGGREGATE ( [Adventure Works].[Internet Sales].[Internet Sales Amount] )AS [Sales0_0], KEY ( [Adventure Works].[$Product].[Product],0 ) AS [oduct1_0], [Adventure Works].[$Product].[Product],
[Adventure Works].[$Date].[Calendar Year]
FROM [Adventure Works].[Internet Sales]
NATURAL JOIN
[Adventure Works].[$Product]
NATURAL JOIN
[Adventure Works].[$Date]
WHERE [Adventure Works].[$Product].[Product] = ‘Mountain-100 Black, 48’
AND [Adventure Works].[$Date].[Calendar Year]=’CY 2002′
GROUP BY [oduct1_0], [Adventure Works].[$Date].[Calendar Year]

 

runs, whereas
 
SELECT AGGREGATE ( [Adventure Works].[Internet Sales].[Internet Sales Amount] )AS [Sales0_0], KEY ( [Adventure Works].[$Product].[Product],0 ) AS [oduct1_0], [Adventure Works].[$Product].[Product],
[Adventure Works].[$Date].[Calendar Year]
FROM [Adventure Works].[Internet Sales]
NATURAL JOIN
[Adventure Works].[$Product]
NATURAL JOIN
[Adventure Works].[$Date]
WHERE [Adventure Works].[$Product].[Product] = ‘Mountain-100 Black, 48’
OR [Adventure Works].[$Date].[Calendar Year]=’CY 2002′
GROUP BY [oduct1_0], [Adventure Works].[$Date].[Calendar Year]
 
produces an error, which limits its usefulness. Nor can I get any query which uses the COUNT function to work, for example:

SELECT COUNT(*)
FROM [Adventure Works].[Internet Sales]
WHERE [Adventure Works].[$Product].[Product] = ‘Mountain-100 Black, 48’
AND [Adventure Works].[$Date].[Calendar Year]=’CY 2002′

 
gives an error. So while we’ve got the potential to do some useful things here, it isn’t exactly the most useful implementation of SQL I’ve ever seen. There are some other pointers to other functions that are supported in the Analysis Services 2005 cartridge, found in
C:\Program Files\Microsoft SQL Server\MSSQL.5\OLAP\bin\Cartridges\as90.xsl
on my machine. This is the file that contains the instructions for AS on how to generate the SQL used for processing cubes, and there are what looks like several other functions mentioned in here that could be worth looking at.
 
But do we want to use SQL to query Analysis Services anyway? I’ve talked about this before, here:
I don’t think so, and it seems to me that although there are some things that are easier to express in SQL than MDX the acutal subset of SQL that is implemented in here is crippled in some important respects. MDX is so much better suited for BI queries and although I know a lot of people struggle with it at first, it’s definitely worth the pain in the end. 
 
 

8 thoughts on “Can I run SQL against an Analysis Services cube?

  1. >> Never let it be said that I\’m not responsive to the needs of my audience – although I\’ll draw the line at pandering to the many people who seem to be looking for pictures of "hot bi action"…
     
    LOL!!!

    1. You probably shouldn’t use local cubes anyway – there are many problems with them, and performance is often bad. Have you considered PowerPivot instead?

  2. great article
    The reason some of us (well, me) want to query SSAS cubes from SQL is that we want to combine different data (purchase orders and safety stock by branch) with total sales (which is already accumulated in the sales cube and would be tedious to aggregate again (so we can work out how many weeks’ stock we have from the sales history and the current onhand).
    Maybe I should figure out how to combine in a single cube simple accumulative cell data (sales by product by day by site by customer) with non-accumulative data (current on-hand, safety stock level, current outstanding purchase orders and current unfulfilled sales orders. for this type of data the time dimension does not work: you can’t add it up.
    Mark :Lockett

    1. Hi Mark,

      It’s a very common requirement to combine different types of data in a cube in the way you describe, and SSAS is capable of doing this. It’s all a matter of how you model the data, and how you design the cube to handle how the data aggregates up – have you looked at semi-additive measures, for example? Even more complex types of aggregation can be implemented with MDX.

  3. This syntax doesn’t seem to work with DMVs. Is there a way to do a join with the tables below to get the table name on the first query?
    select TableID, [Name] as PartitionName, QueryDefinition from $SYSTEM.TMSCHEMA_PARTITIONS;
    select [ID] as TableID, [Name] as TableName from $SYSTEM.TMSCHEMA_TABLES;

Leave a Reply to It just works–but why? « Chris Webb's BI BlogCancel reply