by spenczar5 2484 days ago
Thanks for publishing this. I admit I have only skimmed the paper and plan to look closer later today. My first critical thought was "I wonder why section 4 doesn't make comparisons to t-digest?" I think of t-digest as the most common modern mergeable streaming quantile algorithm in practice. Why didn't you include it in your comparisons?

Either way, this looks very promising. I am excited to take a closer look and possibly use this. Thanks!

1 comment

T-digest is definitely the best rank-accurate sketch for higher percentiles in terms of size. However, it was much slower than GK (Greenwald-Khanna), and we found that after increasing GK's rank accuracy (at the cost of a larger sketch), the results were not too different. Both, of course, have the issues inherent in all rank-accurate sketches at the higher percentiles of long-tailed data.
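
To make the long-tail point concrete, here is a rough sketch of why a rank guarantee says little about value error near the tail. It is not taken from the paper and it does not use any real sketch: it assumes a Pareto distribution with shape 1 as a stand-in for long-tailed data, and a hypothetical sketch whose returned rank may be off by up to `rank_eps`. The function names are made up for illustration.

```python
def pareto_quantile(q: float, alpha: float = 1.0) -> float:
    """Exact q-quantile of a Pareto distribution with scale 1 and shape alpha.

    Assumption: Pareto is only a stand-in for "long-tailed data".
    """
    return (1.0 - q) ** (-1.0 / alpha)


def worst_relative_error(q: float, rank_eps: float, alpha: float = 1.0) -> float:
    """Largest relative value error if the answer may sit anywhere in
    the rank interval [q - rank_eps, q + rank_eps].

    Assumption: rank_eps models a hypothetical rank-accurate sketch.
    """
    true_value = pareto_quantile(q, alpha)
    # Clamp the ranks so the quantile function stays finite and defined.
    high = pareto_quantile(min(q + rank_eps, 1.0 - 1e-12), alpha)
    low = pareto_quantile(max(q - rank_eps, 0.0), alpha)
    return max(high - true_value, true_value - low) / true_value


if __name__ == "__main__":
    for q in (0.5, 0.9, 0.99):
        err = worst_relative_error(q, rank_eps=0.005)
        print(f"q={q}: worst relative value error ~ {err:.1%}")
```

Under these assumptions, the same 0.5% rank error costs about 1% in value at the median but about 100% at p99, because the quantile function gets very steep in the tail.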