Hacker News new | ask | show | jobs
by tsenart 2865 days ago
How does it compare to https://github.com/tdunning/t-digest?
2 comments

Author here Some benchmarks on insertion

---

BenchmarkMetrics/Add/streadway/quantile-8 5000000 358 ns/op

BenchmarkMetrics/Add/bmizerany/perks/quantile-8 5000000 291 ns/op

BenchmarkMetrics/Add/dgrisky/go-gk-8 5000000 363 ns/op

BenchmarkMetrics/Add/influxdata/tdigest-8 5000000 250 ns/op

BenchmarkMetrics/Add/axiom/quantiles-8 10000000 208 ns/op

---

I think its the fastest for insertion

Querying need finalization of state then its just pretty fast but will comment once i can get the API into a friendlier state :D

Aren't the goals of t-digest a little bit different?

T-digest seeks to have a bounded size and an error proportional to q*(1-q), hence it gives up quantile accuracy in the middle of the distribution when under load. This algorithm seems to provide total bounded error without small but unbounded size.

Could you elaborate on the differences a bit deeper? I’m really interested in understanding.
http://web.cs.ucla.edu/~weiwang/paper/SSDBM07_2.pdf is the paper its mostly based on Figure 1. Actually describes how big the datastructure can get. It keeps getting bigger the more data you feed it.