Only tangentially related to your comment, but last year there was a series of postings "Damn Cool Algorithms" in which the last post covered HyperLogLog in the post on cardinality estimation [0]. Some discussion in the HackerNews submission [1].
one of the biggest benefits of HLL is that if you are only interested in set cardinalities, you don't have to store the original items at all, just a small HLL data structure which can be used to compute set unions, and with some caveats, intersections.
It would be really great if Redshift supported HLL as a proper data type, and not just an optimization for COUNT(). A proper data type would allow us to store pre-aggregated data instead of billions of unnecessarily granular rows.
[0] http://blog.notdot.net/2012/09/Dam-Cool-Algorithms-Cardinali... [1] https://news.ycombinator.com/item?id=4488946