Hacker News new | ask | show | jobs
by gtrubetskoy 3908 days ago
I believe that what you're referring to as "a lot work to do on top of that" is the correct solution, the ultimate OSS project.

The fallacy of the many time series db's out there is that they discarded the relational database as a viable storage option prematurely and are now trapped solving the very hard problem of horizontally scalable distibuted storage that takes many years to solve instead of focusing on the time-series aspect of it.

Sooner or later something along the lines of pg_shard will become standard in PostgreSQL and other databases, thus you don't really need to write your own sharding logic, you just have to wait. OR you can write it if you want. You have options.

Aggregations is what GROUP BY is for. Removing old data is a non-issue if you're using a round-robin approach (see my blog link), it also makes aging different series at different rates easy.

1 comments

It's good to have competition in this area. Influx is giving me whiplash (it's on its third database engine in six months!)

The circular buffer approach is fine, but it does have the drawback of being unable to represent variable-length data (like key-value pairs). It's also harder to compress the data.

Can a GROUP-BY do windowed aggregations? Like, take an average over ten-minute windows? My SQL knowledge is not great.