Hacker News new | ask | show | jobs
by heinrichhartman 2484 days ago
Nice work! Averaging percentiles is well-known to give terrible results. Glad to see more people, taking this problem serious, and providing viable alternatives!

A note on Accuracy: At Circonus, we have been using a version of HDR-Histograms [1] for many years to aggregate latency distributions, and calculate accurate aggregated percentiles. Accuracy was never a problem (worst-case error <5%, usually _much_ better).

If I read your evaluation results correctly, you also found HRD-Histograms to be as-accurate or more-accurate, than DDSketches, correct?

The differentiator to HDR Histograms seems to be merging speed and size, where DDSketches seem to have an edge.

One thing that is not immediately clear to me from reading the paper is, how much of the distribution function can be reconstructed from the sketch? E.g. for SLO calculations one is often interested in latency bands: "How many requests were faster than 100ms?" [2].

Is it possible to approximate CDF values ("lower counts") from the sketch with low/bounded error?

[1] https://github.com/circonus-labs/libcircllhist

[2] http://heinrichhartmann.com/pdf/Heinrich%20Hartmann%20-%20La...

2 comments

HDR is great for the use-case when you can bound your range beforehand and merging is not a requirement, but those were also the reasons we needed to develop DDSketch.
Sorry, but I don't follow this argument:

(1) HDR-Histogram merges are 100% accurate and very fast (few microseconds)

(2) The range of HDR-Histogram is bounded in the same way that floating point numbers are bounded. Hence the name "High Definition Range":

> For example, a Histogram could be configured to track the counts of observed integer values between 0 and 3,600,000,000,000 while maintaining a value precision of 3 significant digits across that range.

http://hdrhistogram.github.io/HdrHistogram/

Circllhist offers a default range of 10^-128 .. 10^+127, which has been more than ample for all use-cases I have seen.

https://github.com/circonus-labs/libcircllhist/blob/master/s...

Yes, the sketch can quickly answer questions like "How many requests were faster than 100ms?" 100ms maps to a bucket index, and we would just sum up all the buckets with indices at most that.