Voting has just opened for the PASS Board of Directors, and if you’re eligible you should have received (or should receive soon) an email with instructions on how to vote. Obviously you should vote, because otherwise your interests won’t be represented at Board level. Full details on the elections can be found here: http://www.sqlpass.org/Elections.aspx
I’ve known James for a long time through working with him on the SQLBits committee. He’s a great guy, a friend and someone whose organisational abilities and personal integrity I respect – I know he’d do a great job if he’s elected.
I agree with his ideas on the PASS Global Growth Initiative, something I blogged about a few months back. This is something he’s been very closely involved with since it started, and he’s the only candidate who mentions global growth in his platform (indeed he’s the only candidate from outside the US).
The other candidates are very strong and I’d urge you to check out what they have to say in their platforms. However if you agree with me that PASS needs to become more responsive to the needs of the worldwide SQL Server community, then I’d urge you to vote for James too.
It’s been a while since I’ve written about any third-party SSAS client tools here, isn’t it? This is partly because there aren’t as many of them around as there used to be; if I’m honest, it’s also because I find writing product reviews pretty dull as well. That said, I’m always interested to see demos of these tools, and my customers are always asking me about what’s available so I need to keep my knowledge up-to-date.
I’d like to point out before we go any further that his post is NOT a product review, but some thoughts that occurred to me after seeing one of these client tool demos.
Anyway, few months ago I was given a demo of a tool called Deltamaster, sold by a German company called Bissantz. Now if you’re reading this in Germany (or Austria, or Switzerland), you’re probably wondering why I’m writing about something that’s been around almost as long as SSAS itself – I’ve certainly known about it for years although I’ve never properly played with it. If you’re reading this elsewhere, though, you probably won’t have heard of Deltamaster because it isn’t widely sold outside its home market. It’s a traditional, full-featured desktop SSAS client tool tool that does all the things you’d expect a traditional, full-featured desktop SSAS client tool to do. It does PivotTable-like things. It does charts and sparklines. It allows you to save multiple views in briefing books. It has menus coming out of its ears and hundreds of different options for doing things. The UI is, if anything, a bit too busy and slightly old-fashioned looking, but it does everything you’d want it to.
What really caught my attention, though – and I’m sure this is a feature its had for ages – was the range of guided analyses it has built-in. With just a few clicks and the selection of a few parameters, you can do some very sophisticated stuff. Here’s the Concentration Analysis (aka ABC analysis) report that it produced for months on the Adventure Works cube, complete with colour coding, chart and all the working:
A distribution analysis (again, notice the stats in the box on the right hand side):
Even some impressive-looking data mining stuff that I don’t quite understand (I should RTFM):
I’ve seen this kind of thing before, but Deltamaster does this well and has by far the largest number of different types of analysis available. And all this made me think, why don’t more tools do this? Why doesn’t Excel feature this kind of functionality?
Data visualisation tools like Tableau have done well by making something that’s difficult and easy to get wrong – data visualisation – much easier, by pointing you in the right direction and stopping you doing things you shouldn’t be doing. You think you want a pie chart? You don’t get it, because pie charts are a Bad Idea. You get what’s good for you, not what you want. What Deltamaster is doing (and I think it’s an idea that could be taken a lot further) is the same thing but for data analysis, statistics and data mining. Now I know next to nothing about statistics and while I’m not proud of that fact, I’ve only managed to survive as long as I have in the BI world because my customers know less about statistics and data analysis techniques than I do. So far, the big struggle in BI has been to present the correct figures in a table with reasonable performance. The next problem in BI, once the data has been delivered, is to make sure business people interpret it properly. This is what good data visualisation tools do, and I think this is what guided analysis functionality could do as well. Are sales really going up, or is this seasonal variation? Is there a correlation between running promotions and increased sales? Does a customer’s gender, age, occupation or education level tell us anything about their likelihood of buying from us? At the moment there are plenty of BI tools that give us the ability to answer these questions if we know what we’re doing, but most of us don’t.
So, the key thing though is not to provide lots of types of guided analyses, but to make them easy to use and difficult to make mistakes with. If I was to criticise Deltamaster it would be because it provides a whole bunch of analyses that spit out graphs and stats, but it doesn’t go far enough to help you choose which type of analysis is right for your business problem, to help you choose the right parameters to pass in, and to help you make sense of the results which are returned; it’s still well ahead of most of the competition though. Would some level of user education always be necessary? Would the tool need to know about the data it’s working with, and the business problems associated with that data? To some extent maybe. I still think there’s a lot of room for improvement on what we’ve got today though.
I read an interesting article by Stephen Swoyer today on the TWDI site today, about a new Gartner report that suggests that companies should start selling the data they collect for BI purposes to third parties via public data marketplaces. This is a subject I’ve seen discussed a few times over the last year or so – indeed, I remember at the PASS Summit last year I overheard a member of the Windows Azure Marketplace dev team make a similar suggestion – and I couldn’t resist the opportunity to weigh in with my own thoughts on the matter.
The main problem that I had with the article is that it didn’t explore any of the reasons why companies would not want to sell the data they’re collecting in a public data marketplace. Obviously there are a lot of hurdles to overcome before you could sell any data: you’d need to make sure you weren’t selling your data to your competitors, for example; you’d need to make sure you weren’t breaking any data privacy laws with regard to your customers; and of course it would have to be financially worth your while to spend time building and maintaining the systems to extract the data and upload it to the marketplace – you’d need to be sure someone would actually want to buy the data you’re collecting at a reasonable price. Doing all of this would take a lot of time and effort. The main hurdle though, I think, would be disinterest: why would a company whose primary business is something else start up a side-line selling its internal data? It has better things to be spending its time doing, like focusing on its core business. If you sell cars or operate toll roads why are you going to branch out into selling data, especially when the revenue you’ll get from doing this is going to be relatively trivial in comparison?
What’s more, I think it’s a typical piece of tech utopianism to think that data will sell itself if you just dump it on a public data marketplace. Maybe apps on the Apple App Store can be sold in this way, but just about everything else in the world, whether it’s sold on the internet or face-to-face, needs to be actively marketed and this is something that the data generators themselves are not going to want to make the effort to do. As I said earlier, those companies that are interested in selling their data will still need to be careful about who they sell to, and the number of potential buyers for their particular data is in any case going to be limited. Someone needs to think about what the data can be used for, target potential customers and then show these potential customers how the data can be used to improve their bottom line.
For example, imagine if all the hotels around the Washington State Convention Centre were to aggregate and sell information on their bookings for the next six months into the future to all the nearby retailers and restaurants, so it was possible for them to predict when the centre of Seattle would be full of wealthy IT geeks in town for a Microsoft conference and therefore plan staffing and purchasing decisions appropriately. In these cases a middle man would be required to seek out the potential buyer and broker the deal. The guy that owns the restaurant by the convention centre isn’t going to know about this data unless someone tells him it’s available and convinces him it will be useful. And just handing over the data it isn’t really good enough either – it needs to be used effectively to prove its value, and the only companies who’ll be able to use this data effectively will be the ones who’ll be able to integrate it with their existing BI systems, even if that BI system is the Excel spreadsheet that the small restaurant uses to plan its purchases over the next few weeks. Which of course may well require outside consultancy… and when you’ve got to this point, you’re basically doing all of the same things that most existing companies in the market research/corporate data provider space do today, albeit on a much smaller scale.
I don’t want to seem too negative about the idea of companies selling their data, though. I know, as a BI consultant, that there is an immense amount of interesting data now being collected that has real value to companies other than the ones that have collected it. Rather than companies selling their own data, however, what I think we will see instead is an expansion in the number of intermediary companies who sell data (most of which will be very small), and much greater diversity in the types of data that they sell. Maybe this is an interesting opportunity for BI consultancies to diversify into – after all, we’re the ones who know which companies have good quality data, and who are already building the BI systems to move it around. Do public data marketplaces still have a role to play? I think they do, but they will end up being a single storefront for these small, new data providers to sell data in the same way that eBay and Amazon Marketplace act as a single storefront for much smaller companies to sell second-hand books and Dr Who memorabilia. It’s going to be a few years before this ecosystem of boutique data providers establishes itself though, and I suspect that the current crop of public data marketplaces will have died off before this happens.
A few weeks ago, when I was playing around with the new WebService() function in Excel 2013, I was disappointed to learn that it didn’t work with services that require authentication – a pretty big limitation in my opinion – so, for example, it meant I couldn’t use it to import data from the Azure Marketplace into Excel. You might be wondering why I’d want to do this, when there’s already built-in functionality for importing Azure Marketplace data into Excel 2013 and an addin to do this for earlier versions of Excel; the reason is that I want to be able to dynamically construct the url used to call the service inside the spreadsheet. For example, in this post I’m going to be using the Bing Search API and I want to be able to enter the search term I pass to it in a cell in a worksheet rather than hard-code it in the connection, which is what happens when using the native functionality. I’m sure I could use some VBA to do the same thing but using VBA always seems like an admission of defeat to me, so in this post I’m going to show how you can use Excel Web Queries to do it instead.
So let’s start with the Bing Search API, one of the many APIs and datasets available via the Windows Azure Marketplace. There are plenty of posts explaining how to use the Azure Marketplace website to import data into PowerPivot (see here for example) so I won’t go into too much detail at this point, but here’s an example Url that returns the top 15 news results for the search term ‘Microsoft’:
Next, you need to fire up Excel 2013 and create a Web Query, a feature that’s been in Excel since ohhhh, Office 2000. There are a lot of good articles on the web describing how this feature works but here’s one of the best I found: http://www.vertex42.com/News/excel-web-query.html
Specifically, you need to go to the Data tab and click the From Web button:
This opens the New Web Query dialog; you then paste the Url from the Azure Marketplace into the Url box and click go. You’ll be prompted for a username and password, and you need to enter your Azure Marketplace account key for both. Click on the small yellow arrow in the top left hand corner of the browser window and you’ll see something like this:
Do not press Import at this point though! Instead, press the Save Query button in the toolbar (highlighted in the screenshot above) and then save the query to disk as an .iqy file and click Cancel. Next, find the .iqy file you’ve just saved and open it in Notepad. The contents will be something like this:
WEB 1 https://api.datamarket.azure.com/Data.ashx/Bing/Search/v1/News?Query=%27microsoft%27&$top=15&$format=Atom
What you need to do now is to parameterise the Url in this file so that, instead of hard-coding it to search for news about microsoft (as in this example) you can enter your own search term. The link above describes how to do this in detail, but here’s the parameterised version of the Url for reference:
You then need to save the file and go back to Excel. First, enter a search term in a cell in a worksheet. Then, go to the Data tab, click the Existing Connections button and then the Browse For More button and select the .iqy file. The next step is to choose a cell in a worksheet to dump the data to, and then when prompted for the parameter value click on the cell where you entered your search term and check the Use This Value/Reference For Future Refreshes and Refresh Automatically When Cell Value Changes boxes.
Click OK and you’ll be prompted for credentials again – although Excel will remember them – and the query will run, resulting in some XML appearing in the cell you selected:
This is ugly, and indeed Excel has the ability to import XML returned from a Web Query as XML – but what I’ve found is that if you don’t use the approach I’m showing here, you won’t be able to refresh you query properly.
So you now have our search results as XML and you want to be able to display them in a more meaningful way. This is where the new FilterXML function in Excel 2013 comes into its own. Create an Excel table with four columns: Rank, Title, Description and Url; then enter numbers from 1 to 15 in the Rank column (there are fifteen results returned from the Bing search). You can then use Excel table formulas like this one to get the Title, Description and Url from the XML and display it in the table:
(I’m very grateful to Phil Quinn, who I met at SQL Saturday 162, for helping me with the XPath here – I spent ages trying to get it to work with no luck and he managed to provide a working query in 2 minutes)
With this done, you now have the ability to enter a search term in your worksheet and automatically get Bing search results displayed in an Excel table. Of course, this approach would work with any dataset from the Azure Marketplace and indeed any OData source – for example, now that SSRS 2008 R2 and greater can render reports to OData, it should be possible to consume data from an SSRS report in a much more elegant way than the method I described here, because you’d be able to pass parameters from Excel to SSRS (hmmm, maybe this needs its own blog post). Of course, now you have a table in Excel you can add it to the Excel Model and do all kinds of interesting PowerPivot-y things with it.
What this really highlights, though, is that it should be much easier to parameterise the queries used to load data into the Excel Model/PowerPivot, whether they are OData data sources, SQL, MDX or DAX. Not being able to parameterise these queries means you increase the temptation to load all the data that might ever be needed by a user into PowerPivot; if it were easier to parameterise these queries then it would encourage PowerPivot users to build solutions where they only imported the data they actually needed to work with at any given moment.
If you’re going to be at one of these events, come and say hello! Next year we’ll hopefully have another SQLBits, although we don’t have a date fixed yet, and hopefully I’ll be speaking at Database Days in Switzerland too.