I've been pretty happy with Datadog's distribution type [1], which uses their own approximate histogram data structure [2]. I haven't evaluated its error bounds deeply in production yet, but I haven't had to tune any bucketing. The linked paper [3] claims a fixed relative error bound on every percentile estimate.
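For anyone wondering why no bucket tuning is needed, here is a rough sketch of the logarithmic bucket mapping as I understand it from the paper. This is not Datadog's production code; `ALPHA`, `bucket_index`, and `bucket_value` are names made up for illustration, and it only handles positive values.

```python
import math

ALPHA = 0.01                        # assumed target relative error (1%)
GAMMA = (1 + ALPHA) / (1 - ALPHA)   # ratio between consecutive bucket bounds

def bucket_index(x: float) -> int:
    """Bucket i covers the range (GAMMA**(i-1), GAMMA**i]; x must be > 0."""
    return math.ceil(math.log(x) / math.log(GAMMA))

def bucket_value(i: int) -> float:
    """Representative value reported for every sample that fell in bucket i."""
    return 2 * GAMMA**i / (GAMMA + 1)

# Any positive x comes back within ALPHA relative error of itself.
x = 1234.5
estimate = bucket_value(bucket_index(x))
print(abs(estimate - x) / x <= ALPHA)
```

The bucket boundaries depend only on the chosen error, not on the data, which is why the same mapping works for any positive range. The cost is that the number of buckets grows with the log of the range divided by log(GAMMA).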
That is a very different tradeoff, though. A DDSketch is absolutely gigantic compared to a power-of-four binned distribution that could be implemented as a vector of integers. A practical DDSketch will be 5 KiB or more. And when they say DDSketch merges are "fast", they are comparing to other sketches that take microseconds or more to merge, not to CDF vectors that can be merged literally in nanoseconds.
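To make the comparison concrete, here is a minimal sketch of a power-of-four binned histogram stored as a flat vector of counters. The bin count (`NUM_BINS = 32`) and the function names are assumptions for illustration, and it only handles non-negative integers (say, latencies in nanoseconds).

```python
NUM_BINS = 32  # assumed; the last bin absorbs anything >= 4**(NUM_BINS - 2)

def bin_index(v: int) -> int:
    """Bin 0 holds v == 0; bin k holds 4**(k-1) <= v < 4**k."""
    return min((v.bit_length() + 1) // 2, NUM_BINS - 1)

def new_hist() -> list[int]:
    return [0] * NUM_BINS

def record(hist: list[int], v: int) -> None:
    hist[bin_index(v)] += 1

def merge(a: list[int], b: list[int]) -> list[int]:
    """Merging is just an elementwise add over a fixed-length vector."""
    return [x + y for x, y in zip(a, b)]

h1, h2 = new_hist(), new_hist()
for v in (0, 3, 4, 15):
    record(h1, v)
record(h2, 16)
print(merge(h1, h2)[:4])  # [1, 1, 2, 1]
```

With 64-bit counters, 32 bins come to 256 bytes. Python obviously won't show nanosecond merges, but in a compiled language the merge is a fixed-length loop of integer adds with no branching or allocation. The price is resolution: each bin spans a factor of four, which is far coarser than a DDSketch's percent-level relative error.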