Distinct Count White Paper

Yet another excellent paper on optimising distinct count measures from the SQLCat team:

Actually, I’m beginning to wonder whether I should be linking to the SQLCat team site at all. I never link to Mosha because I assume that everyone who reads my blog reads his too, and in the same way I would hope everyone subscribes to the SQLCat team blog as well.

One topic missing is a comparison of the performance of distinct count measures with the technique of using many-to-many dimensions to get the same result that Marco Russo describes in his famous m2m white paper:

Marco presented on this at PASS Europe and mentioned (which tallies with my experience) that this approach can perform as well as, and sometimes better than, a distinct count measure.
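
To make the comparison concrete, here is a toy sketch in Python of why the two approaches return the same number. It only models the logic, not how Analysis Services evaluates either design, and it says nothing about performance. The `Sale` rows, the product and customer names, and both function names are made up for illustration.

```python
from collections import namedtuple

# Hypothetical fact rows: one row per sale.
Sale = namedtuple("Sale", ["product", "customer"])

SALES = [
    Sale("bike", "alice"),
    Sale("bike", "bob"),
    Sale("bike", "alice"),    # repeat purchase, must not be counted twice
    Sale("helmet", "alice"),
    Sale("helmet", "carol"),
]


def distinct_count(sales, product):
    """Distinct count measure: count unique customers in the sliced fact rows."""
    return len({s.customer for s in sales if s.product == product})


def m2m_count(sales, product):
    """Many-to-many style: a plain count over a customer measure group,
    filtered through the fact table acting as the intermediate (bridge)."""
    # Customer measure group: one row per customer, each carrying a count of 1.
    customer_rows = {customer: 1 for customer in {s.customer for s in sales}}
    # Customers reachable from the selected product through the bridge.
    linked = {s.customer for s in sales if s.product == product}
    # Each linked customer row is aggregated once, however many fact rows link to it.
    return sum(customer_rows[customer] for customer in linked)


for product in ("bike", "helmet"):
    print(product, distinct_count(SALES, product), m2m_count(SALES, product))
```

Both functions agree for every slice in this sample because the many-to-many relationship aggregates each related customer row once, which is the behaviour the technique relies on.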

  1. Thanks for the comments!  We did have some customer evidence that the many-to-many approach got slower in comparison to this version of Distinct Count Optimization as we started hitting enterprise sizes.  Saying this, I wouldn't say that this is conclusive; in the end both methodologies definitely stress the need for performance tuning for distinct count calculations, eh?!

