by daniele_dll 1373 days ago
I would love to sit down and write more documentation, but I am doing this work at night and/or over the weekends, so sorry for not having it yet. I promise I will slowly start putting together something more than a "TODO", even if it's just a general intro.

The timeseriesdb is "half" a consequence of having a Write-Ahead Log that is split into chunks that are chained together. I say "half" because the other half is the future addition of secondary indexes, which will make it possible, and easier, to query the internal db properly.
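To make the "half a consequence" point concrete, here is a minimal Python sketch, not cachegrand's actual code or on-disk format. It assumes that "chained" means each chunk stores a hash of the previous one (one possible reading), and the names `Chunk`, `ChunkedLog`, `history` and `verify` are all hypothetical. Since every write is appended and never overwritten, the log already contains the time series of each key.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Chunk:
    # Assumption: chunks are chained by storing the previous chunk's hash.
    prev_hash: str
    entries: list = field(default_factory=list)  # (timestamp, key, value)

    def digest(self) -> str:
        h = hashlib.sha256(self.prev_hash.encode())
        for ts, key, value in self.entries:
            h.update(f"{ts}|{key}|{value}".encode())
        return h.hexdigest()


class ChunkedLog:
    """Hypothetical append-only log split into fixed-size, chained chunks."""

    def __init__(self, chunk_size: int = 2):
        self.chunk_size = chunk_size
        self.chunks = [Chunk(prev_hash="")]

    def append(self, ts: int, key: str, value: str) -> None:
        # Start a new chunk, linked to the previous one, when the current is full.
        if len(self.chunks[-1].entries) >= self.chunk_size:
            self.chunks.append(Chunk(prev_hash=self.chunks[-1].digest()))
        self.chunks[-1].entries.append((ts, key, value))

    def history(self, key: str) -> list:
        # Full scan of every chunk: works, but this is the part a
        # secondary index would make cheap.
        return [(ts, v) for c in self.chunks for ts, k, v in c.entries if k == key]

    def verify(self) -> bool:
        # Check that every chunk still points at its predecessor's hash.
        return all(
            self.chunks[i].prev_hash == self.chunks[i - 1].digest()
            for i in range(1, len(self.chunks))
        )
```

Calling `history("some-key")` walks all chunks in order and returns the past values of that key, which is the "recorded but not yet properly queryable" state described here.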

I know that we are talking mainly about Redis right now, but that's just the tip of the iceberg. My long-term vision is to build a much more complete and flexible platform that can easily handle streams (e.g. via a Kafka interface) and/or let you run more advanced data processing via WASM (e.g. I want to make it possible to calculate rolling window averages in a time-aware fashion :)).

1 comments

So the TSDB is being recorded, it’s just not queryable, or it’s not binary compatible with whatever the future on-disk format will look like?

Is the time series replicated and merged between nodes or does each have its own log for the keys it manages?

Currently both; the TSDB is a PoC right now.

The timeseries will depend on the keys, as it's the historical sequence of values, so it will be replicated and merged on the nodes that own that specific key in active-active replication with the required replication mode.

Although not mandatory, for the active-active replication mode cachegrand will provide a front-end proxy that "understands" which node is the correct one for a key and always sends the necessary data to the same node (or subset of nodes, depending on the configuration).
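As a rough illustration of what such a proxy has to decide, here is a small Python sketch. It is only an assumption about how key-to-node routing could work, not the cachegrand proxy: the function name `owner_nodes`, the node names and the hash-modulo scheme are all made up for the example.

```python
import hashlib


def owner_nodes(key: str, nodes: list[str], replicas: int = 1) -> list[str]:
    """Hypothetical routing: map a key to a stable subset of nodes."""
    # Hash the key so the same key always lands on the same starting node.
    digest = hashlib.sha256(key.encode()).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    # Take `replicas` consecutive nodes, wrapping around the list.
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


nodes = ["node-a", "node-b", "node-c"]     # hypothetical cluster
targets = owner_nodes("user:42", nodes, replicas=2)
```

The important property is only that the mapping is deterministic, so every write for a given key reaches the same node(s). Plain modulo hashing moves most keys when the node count changes; a real deployment would more likely use something like consistent hashing or fixed hash slots.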

The replication itself will be last-write-wins. If the order of the writes matters, it will be important to always write to the same node (a very common pattern to reduce syncing and locking on the replication side).
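A minimal Python sketch of a last-write-wins merge, to show why the writing node matters. This is a generic illustration under assumed details, not cachegrand's replication code: the `Versioned` record, its `timestamp` and `node_id` fields, and the tie-break on node id are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # hypothetical write timestamp taken on the writing node
    node_id: str    # hypothetical tie-breaker when timestamps are equal


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    # Keep the write with the highest (timestamp, node_id) pair.
    # The result is the same whichever replica performs the merge.
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


# Two nodes accept a write for the same key; node-b's clock is behind,
# so its write carries a lower timestamp even if it happened later.
from_a = Versioned("first", timestamp=1000, node_id="node-a")
from_b = Versioned("second", timestamp=990, node_id="node-b")
winner = lww_merge(from_a, from_b)
```

Here `winner` is the write from `node-a`, because only the timestamps are compared. If both writes had gone through the same node, they would have been stamped by a single clock, which is why writing to the same node is the safe pattern when order matters.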