Hacker News new | ask | show | jobs
by onlyrealcuzzo 1437 days ago
With 98% accuracy, a Hyperloglog can count ~4Bn distinct 64-bit integers in ~1.5Kb.

Without a data structure, this would take GigaBytes (1M times the space).

That's pretty incredible to me.

Do the combinatorics of every possible combination of 4B distinct 64-bit integers that that 1.5kb can represent.

Mind blowing.

2 comments

If I did the calculations correctly, that's over 20 Exabytes worth of data that ~1.5kb could represent with ~98% accuracy.

20 Exabytes is probably close to all the information stored in Google...

I think what really fastinates me is how much better we can solve a problem if we allows some approximation/uncertainty.