It’s the third and final keynote at the PASS Summit, and this morning I’ve been given a space on the official blogger table! Actually this just means I’ve got a table to rest my laptop on and access to a power strip, but it’s an honour nonetheless.
There are several things that I saw yesterday that are worth mentioning. Probably the most interesting session was from Cathy Dumas about Tabular: among other things she demoed a DAX Editor plugin for SQL Server Data Tools (formerly BIDS) that is going to make everyone’s life soooo much easier; it will give us something like an MDX Script editor, with intellisense, colour coding and so on. She has blogged about it here and I can’t wait for it to be made available. I also came across the Data Explorer team site and blog; if you are interested in getting to play with it when it’s ready then you can get your email address added to the invite list.
Anyway, back to the keynote and today it’s PASS Summit favourite Dr David DeWitt covering Big Data. It’s not a standard marketing session, more of a lecture, and all the better for it; DeWitt is a very talented speaker and more importantly takes a balanced approach to describing the SQL vs NoSQL debate. Interesting points to note:
- He thinks that the arrival of NoSQL is not a paradigm shift, in the way that the move from hierarchical databases to relational databases was. The assertion that SQL is not dead, not surprisingly, goes down well with this audience.
- Hadoop. I’m not even going to try to summarise this section of the talk but it is an excellent introduction to how it works, and if you’re even vaguely interested in Hadoop (which you should be given Thursday’s announcements) then you need to watch this – I believe it will be available to view on demand somewhere (the slide deck is here). It is, honestly, the best explanation of all this I’ve ever seen and there are several thousand people here in this room who agree with me…
- He does a comparison of Hive vs Parallel Data Warehouse v.next on the same queries, same data and same hardware, and shows that PDW can outperform Hive by up to 10x. This demonstrates that a parallel database still has advantages over a NoSQL approach as far as performance goes in many cases, although of course each has its own strengths and weaknesses and performance isn’t the only consideration.
This was not only an enthralling and educational talk, but it was also great marketing from Microsoft’s point of view. You can announce Hadoop for Windows to a room full of SQL Server types and however many whizzy demos you do, and however much woo-hooing goes on, if we don’t really understand the technology we’ll go back to our day jobs and ignore it. On the other hand, teach us what the technology actually does and you’ll get us interested enough to try it out for ourselves and maybe even use it on a project.
Finally, if you’re at the Summit today come and find me at the Birds of a Feather lunch on the SSAS Performance Tuning table, or at the BI Expert Pod this afternoon.