by ejp 4204 days ago
Any benchmarks or anecdata about how this performs with millions of 'paths'? What about data larger than memory?

Also, you mention difficulties with changing RRD settings, like time intervals. How does this handle similar config migration?

I've recently been involved in RRD replacement as well for the same reasons you cite in the article. Our technology choice was OpenTSDB. How does YAWNDB stack up? (Granted, OpenTSDB is in another category of deployment complexity.)

3 comments

Regarding benchmarks: here is a screenshot I made during development: http://i.imgur.com/wEbn4.png Things to note: 1) ~100k "paths"; 2) ~18.6k RPS; 3) very even load distribution across CPUs (actually, it will scale almost linearly because of its architecture); 4) the CPUs aren't saturated at all (my laptop wasn't able to push enough load). I remember observing a bit of non-linearity in CPU load (there is some sub-linear overhead), so it should handle more than 3x that load.
YAWNDB was never supposed to be a general-purpose DB. Instead, it was a relatively quick hack to store data that fits in memory and is frequently updated. This is why there is no proper migration (or at least there was none; I hope Pavel will correct me if they have added migration lately) and no support for larger-than-memory datasets.

It's also worth noting that neither OpenTSDB nor InfluxDB was on the horizon when YAWNDB was hacked together. They are way more powerful and flexible than YAWNDB, but also more complex.

In a similar vein, it would be neat to compare this against Blueflood (http://blueflood.io/). Disclaimer: I used to be a committer.