by alceufc 4333 days ago
Log-binning can be useful, but it has some disadvantages.

I think that in your case the data (server response times?) looks good because it probably follows a log-logistic or log-normal distribution.

Suppose you were working with values that are exponentially distributed, which is also a reasonable hypothesis for your data. In that case the log-binned histogram would look like a plateau, with the exception of the first and last bins. In this scenario a linear-binning approach would probably be better.
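
If you want to check how the two binning schemes treat each kind of data, here is a rough sketch in Python with NumPy. The sample size, distribution parameters, bin count and the `binned_counts` helper are all made up for illustration; swap in your real response times.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical samples standing in for response times; parameters are arbitrary.
samples = {
    "log-normal": rng.lognormal(mean=0.0, sigma=1.0, size=10_000),
    "exponential": rng.exponential(scale=1.0, size=10_000),
}


def binned_counts(values, n_bins=20):
    """Return (linear_counts, log_counts) over the same [min, max] range."""
    lo, hi = values.min(), values.max()
    # Linear binning: equal-width bins.
    linear_edges = np.linspace(lo, hi, n_bins + 1)
    # Log binning: bin edges grow by a constant factor (requires lo > 0).
    log_edges = np.geomspace(lo, hi, n_bins + 1)
    linear_counts, _ = np.histogram(values, bins=linear_edges)
    log_counts, _ = np.histogram(values, bins=log_edges)
    return linear_counts, log_counts


for name, values in samples.items():
    linear_counts, log_counts = binned_counts(values)
    print(name)
    print("  linear bins:", linear_counts.tolist())
    print("  log bins:   ", log_counts.tolist())
```

Printing the counts side by side (or plotting them) lets you judge which binning gives the more readable shape for a given distribution.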

Unfortunately, I don't think any single bucketing approach works well in all situations. The best choice usually depends on your data and on what you are trying to analyze.