by sethammons
3309 days ago
They addressed your second point in the article. On a popular post, you would be storing several megabytes of data to capture/relate each unique user that visited. That gets expensive at scale. HLL takes that down to a few kilobytes, less than 1% of the original size. For your first suggestion, you would have to do a very expensive lookup on every read. You couldn't cache it effectively because of the requirement for near-real-time stats. You could improve lookup time using columnar storage, but the performance and memory usage would be nowhere near as good as with HLL. Problems are harder at scale.
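
To make the size difference concrete, here is a rough sketch of the idea in Python. It is not the implementation from the article: the class name, the SHA-1 hash, and the choice of p=12 (4096 one-byte registers) are my own assumptions for illustration, and a production system would use a tuned library instead.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog counter (illustrative only, not production code)."""

    def __init__(self, p=12):
        self.p = p                          # number of index bits (assumed value)
        self.m = 1 << p                     # number of registers, 4096 for p=12
        self.registers = bytearray(self.m)  # one byte per register, fixed size

    def add(self, item):
        # Take 64 bits of a SHA-1 digest as the hash of the item.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64-p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias constant for m >= 128
        est = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            est = m * math.log(m / zeros)
        return round(est)


def compare(n_users=100_000, visits_per_user=2):
    """Feed the same visits to an exact set and to the HLL sketch."""
    exact, hll = set(), HyperLogLog()
    for i in range(n_users):
        uid = f"user-{i}"                          # hypothetical user IDs
        for _ in range(visits_per_user):
            exact.add(uid)
            hll.add(uid)
    return len(exact), hll.count(), len(hll.registers)
```

Calling `compare()` returns the exact count, the HLL estimate, and the register array size in bytes. The set grows with every new visitor, while the register array stays at 4096 bytes no matter how many users you add. The trade-off is that the estimate is approximate: with 4096 registers the expected relative error is roughly 1.04/sqrt(4096), or about 1.6%.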