R and F#

One of my new year’s resolutions – or at least, something that got added to my list of stuff to do in the unlikely event I’ve got some time spare and can be bothered – was to learn more about statistics. I’ve only got a very basic grasp of the subject but, like data mining, it’s one of those things that seems to promise to be incredibly useful in my line of work. However it’s interesting to ponder that I’ve been working in BI for almost a decade and never so far needed to learn much beyond basic stats; my theory is that stats, like data mining, only tends to be used by highly skilled quantitative analysts, whereas the people I work with are business people whose maths skills are very average and who quite rightly don’t trust analysis done using methods they can’t understand.

Anyway, in my browsing on the subject I came across the all-of-a-sudden popular topic of R (see http://www.r-project.org/), the statistical programming language. I thought it might make an interesting blog entry, but today I saw John Brookmyre beat me to it so I’ll just link to him instead:

I also got interested in learning about F#, the functional programming language that will be included in VS2010 (for a good overview, see http://www.developer.com/net/net/article.php/3784961). I was struck by some similarities with MDX and began to wonder about how it could be applied to BI; and yet again, a quick Google revealed Aaron Erickson had had the same idea and blogged extensively and intelligently on the subject:

It’ll be interesting to watch the uptake of F# in BI; from what I can see there’s already a lot of activity in the area of data manipulation and stats for F# (see for example Luca Bolognese’s blog) and I’m sure it’s only going to grow. The only complaint I’ve got is that here’s yet another addition to the Microsoft BI toolset and I’m yet to be convinced there’s any kind of company-wide strategy aimed at shaping all these tools into a coherent BI strategy. F# won’t be the language of BI in the way that Aaron wants; it’s more likely to end up as a technology island in the way Aaron specifically doesn’t want. But hey, the .NET guys have arrived at the party! The more the merrier.

Intelligencia Desktop Client

DISCLAIMER: since I licensed my SSRS custom data extension for SSAS to iT-Workplace, and since this technology is used in Intelligencia Desktop Client, I benefit financially from sales of this tool!

If you’re a regular reader of this blog, you’re no doubt aware that about a year ago I came up with an idea for a custom data extension for SSRS that makes it much easier to work with SSAS data sources. This subsequently became part of the Intelligencia Query product, which I blogged about here and which has since gone through several releases. iT-Workplace, the company that sells Intelligencia Query, also produces a .NET MDX query-generator component suite called Intelligencia OLAP Controls (used in Intelligencia Query), which is aimed at third parties who want to add MDX query capabilities to their own apps. Midway through last year I suggested to Andrew Wiles of iT-Workplace that he wrap these components in an exe and create his own standalone desktop client tool. This became Intelligencia Desktop Client (IDC from here on), which I thought I’d review here in my continuing series on SSAS client tools.

IDC is distinctive because it deliberately doesn’t compete with most other Analysis Services ad hoc query tools – it’s aimed very much at the planning and budgeting market. At present the only version available is the Standard Edition, which gives you query building and reporting functionality; on first impressions it does very much what other advanced ad hoc query tools like Proclarity do, but it has a lot of functionality important to financial users that many such tools lack, such as the ability to construct complex asymmetric sets on axes. In fact it’s as much about creating forms for budget data entry via writeback as it is for querying and reporting; the closest comparison to make is with the PerformancePoint Excel addin, although it’s for people who have built their own financial applications from scratch in Analysis Services rather than used PerformancePoint. The Enterprise Edition, which is still a CTP, will I believe offer yet more data entry and modelling functionality – I think Andrew wants to move towards incorporating cube building capabilities too.

Some features to note:

  • Creating query-based calculations is very easy, and it has an innovative spreadsheet-formula-like approach to doing so that financial users will feel very at home with:
    Unfortunately you can’t copy a calculation from a single cell to a whole range, yet, but I’ve asked for that for a future release…
  • It has a lot of options for formatting the resulting table for printing or inclusion in a document:
    This ties in with its more mature sister product Intelligencia for Office 2007 which takes the form of Word and Excel addins, and is aimed at producing print-quality documents which incorporate live links to OLAP data.
  • This formatting functionality is also useful because IDC can publish queries to Reporting Services:
    Depending on what your requirements are this could be a very easy way of generating SSRS reports based on SSAS data. I wouldn’t go as far as to say that it makes IDC a proper SSRS report design tool since it doesn’t support the creation of any of the more advanced SSRS features; in fact IDC doesn’t have any charting capabilities (although I know this might be in the pipeline) so you can’t create reports with charts.
  • It has an ‘MDX Mode’ where you can turn off the navigation pane and enter whatever MDX you want, with the query results being displayed in the grid; very useful for those times when you have to write the MDX for a query yourself. It even has Intellisense!

Netezza launches data integration strategy for Microsoft BI

Interesting press release from Netezza here:

At the moment it looks like there’s only an OLE DB provider available, but the release says this is only the first part of the strategy. I wonder if Netezza is being considered as a supported data source for Analysis Services so it could be used with cubes in ROLAP mode, as with Teradata today?

UK BI User Group meeting next week

OK, this is rather short notice I know but the UK SQL Server User Group is hosting a BI evening event next week, on January 15th at Microsoft’s HQ in Reading:

Since this is the rumoured date for Microsoft laying off 15% of its staff, you might be advised to wear some wellies in case you step in a puddle of blood on the floor. Hopefully all the BI folks will be ok though…

Anyway if you follow the link you’ll see that I’m one of the speakers: I’ll be repeating the session I did at PASS last year on building a monitoring solution for Integration Services, Analysis Services and Reporting Services. Also speaking will be Allan Mitchell, on his new pet hobby of data mining. Hope to see you there!

What hierarchies are actually being used in my cube?

Working out how to monitor Analysis Services usage is something I’ve spent a fair bit of time doing over the last few months; I’ll get around to writing up my SQLBits presentation here soon, I promise, but in the meantime here’s a quick blog entry on a related topic: how can you work out which dimension hierarchies your users are actually using in their queries? It’s useful to know this for a number of reasons, for example if you’re thinking of changing the AggregationUsage, AttributeHierarchyEnabled or AttributeHierarchyOptimizedState properties. Or if you’re just plain curious.

The first step towards answering this question is to capture all the MDX queries that are being run, and you can do this by running a Profiler trace on the Query Begin event and saving it to a table in SQL Server in the way I described here. Once you’ve got the queries you then need to work out which queries mention which hierarchies and you can do this in SSIS using the Term Lookup transform in the data flow:


Unfortunately the Term Lookup transform is a bit picky about the input it can use. You can get a list of all the hierarchies in every dimension in every database in Analysis Services by running the following DMV query:
select * from $system.mdschema_hierarchies

To be able to use this in a Term Lookup I had to create a linked server in SQL Server pointing to Analysis Services, then build a view that queried the Analysis Services DMV using the OpenQuery function, like so:
select cast(hierarchy_unique_name as varchar(100)) as HIERARCHY_UNIQUENAME from openquery([AS], 'select * from $system.mdschema_hierarchies')

Having done this, and having run the SSIS package, you then get a table containing one row for each query/hierarchy combination:

Running a select distinct on this table of course gives you a list of all the hierarchies mentioned in your queries:

Using different DMVs would allow you to analyse different aspects of the queries – for example, you could work out which MDX functions were being used in the queries with the mdschema_functions DMV (similar to what Mosha was doing here, but maybe with a view to seeing whether your client tool was using ‘bad’ MDX functions).
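For anyone who wants to see the matching logic spelled out, here is a minimal Python sketch of roughly what the Term Lookup step is doing: checking each captured query for each hierarchy unique name. It is only an illustration, not the SSIS package itself. The function name, the sample queries and the sample hierarchy names are all made up for the example, and it assumes queries refer to hierarchies by their full bracketed unique name, so anything written differently (for example without the dimension name) would be missed.

```python
from collections import defaultdict


def hierarchies_used(queries, hierarchy_names):
    """Map each hierarchy unique name to the ids of the queries that mention it.

    queries: dict of {query_id: MDX text}, e.g. loaded from the trace table.
    hierarchy_names: unique names, e.g. loaded from mdschema_hierarchies.
    Matching is a plain case-insensitive substring test (an assumption made
    for this sketch; it is not how Term Lookup is implemented internally).
    """
    usage = defaultdict(set)
    for query_id, mdx in queries.items():
        text = mdx.lower()
        for name in hierarchy_names:
            if name.lower() in text:
                usage[name].add(query_id)
    return dict(usage)


# Hypothetical sample data standing in for the trace table and the DMV output
queries = {
    1: "SELECT [Measures].[Sales Amount] ON 0, [Date].[Calendar].MEMBERS ON 1 FROM [Sales]",
    2: "SELECT [Measures].[Sales Amount] ON 0, [Product].[Category].[Category].MEMBERS ON 1 FROM [Sales]",
    3: "SELECT {} ON 0, [Date].[Calendar].[Year].MEMBERS ON 1 FROM [Sales]",
}
hierarchy_names = [
    "[Date].[Calendar]",
    "[Product].[Category]",
    "[Customer].[Geography]",
]

usage = hierarchies_used(queries, hierarchy_names)

# Hierarchies that never appear in any captured query
unused = sorted(set(hierarchy_names) - set(usage))

for name, ids in sorted(usage.items()):
    print(name, sorted(ids))
print("Unused:", unused)
```

The keys of the result are the equivalent of the select distinct list, and the unused list is probably the more interesting output if you’re deciding which attribute hierarchies to disable or de-optimise.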
