Seeing that Marco Russo has released his book "Programming Microsoft LINQ" reminded me of a conversation I had with him a while ago about something I’ve heard various people ask about over the last year – will there be a LINQ to MDX?
Before we go on, I should state that it’s my opinion that there isn’t a big enough market out there for anyone (Microsoft or a third party) to justify spending time developing LINQ to MDX. That’s not to say that I wouldn’t want to see it – I would – just that I doubt anyone much would use it. MDX remains too much of a niche language, and off-the-shelf tools work well most of the time so there’s less need to write custom MDX-generation code. As far as I know Microsoft isn’t planning on developing LINQ to MDX and I’d be surprised if it ever did, so this will remain a theoretical discussion.
But for the sake of argument if you were to implement LINQ to MDX the main problem you’d have to tackle would be the same one you have with using MDX in Reporting Services and Integration Services: MDX can’t guarantee fixed column names for any given query. However I had an idea on how to avoid this, and that is to think in terms of LINQ to MDX sets rather than LINQ to MDX queries. So for example if you take the following SQL query on a dimension table:
Select Year, Quarter, Month
From TimeDim
Where Year=2008 and (Month=March or Month=April)
that would then translate easily into the following MDX set expression:
{([TimeDim].[Year].[2008], [TimeDim].[Quarter].[Q3], [TimeDim].[Month].[March]), ([TimeDim].[Year].[2008], [TimeDim].[Quarter].[Q3], [TimeDim].[Month].[April])}
MDX sets have to be made up of tuples containing the same dimensionality, and if you think of a set’s dimensionality in terms of columns in a SQL SELECT statement then you can see how that might map onto LINQ concepts. Instead of using LINQ to create an MDX SELECT statement directly, you’d use LINQ to create MDX sets and then pass all of these sets into another function which would then run the query using the sets created as axes.
Since I’m no LINQ expert this was the point where I dropped a mail to Marco to ask him his opinion; he thought it was feasible and even came up with an idea of what the code might look like in C#:
var timeSet = from period in cubeBudget.TimeDim.Members
where period.Year == 2008
&& (period.Level == TimeDim.Level.Year
|| period.Month == "March"
|| period.Month == "April" )
select period;
Var measures = from measure in cubeBudget.Measures
Where new string[] { "Sales", "Quantity", "Price" }.Contains( measure.Name )
select measure;
Var query = from measure in measures
from period in timeSet
select new { period.Name, measure.Name, measure.Value };
var cellset = from cell in cubeBudget
where cell.Columns( measures )
&& cell.Rows( timeSet )
select cell;
In the code above, timeset would define the set of members on the Period dimension you wanted to use and measures would return the set of measures. Then you could use them in two ways: query would return a flattened rowset and cellset would return a cellset. But this all seems very convoluted and probably just as confusing to the average developer as raw MDX.
Another alternative approach would ignore MDX altogether and query Analysis Services using SQL directly (see http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!751.entry for more details on this topic), although I’m sure you’d run into the limitations of the SQL that is supported very quickly and in any case you lose all of the flexibility and functionality of the MDX language when you do this.
Maybe if we’re looking for a way to programmatically generate MDX then LINQ isn’t the way to do it. It’s something olap4j is working towards and they have taken an approach that is much more in tune with the multidimensional nature of MDX. One of the ‘open issues’ that caught my eye in the section of the olap4j spec that deals with this functionality is the question "Is this API at the right level, or is it too close to MDX?", meaning I guess that it would be all too easy to come up with an interface that is just as complex as MDX itself. Where would you draw the line between ease-of-use and functionality? Is it the MDX language that is confusing for people, or the multidimensional concepts (concepts that any interface would have to reflect) that underpin it? Can you abstract MDX to an interface to make it easier to use? I wish I knew more Java so I could test drive olap4j – is there anyone reading this who is using it? As always, I’d be interested to hear anyone’s thoughts on this matter so please leave some comments…