Hacker News
by davidjc1 2780 days ago
There are some properties of the data that can be exploited to justify weaker consistency guarantees. This leads to some desirable design trade-offs in terms of simplicity and performance optimisation. While this could result in data loss, that may be permissible given that queries can span large time ranges, where one or two missing datapoints do not carry the same weight as a financial miscalculation or loss of life. The same could be said of multiplayer games played on mobile devices with intermittent connectivity. In that domain, the player's moves are fast-forwarded once connectivity is restored, as this makes no observable difference to other players. My point is that it's very dependent on the use case, and does not apply across the board.
1 comment

There's nothing wrong with a special-purpose tool for building approximate graphs, but calling it a "time-series database" or even quoting "inserts-per-second" is intellectually dishonest.
Many SSDs only write in 4 KB blocks, so writing a 64-bit datapoint uncompressed to disk would not only be slow, it would also cause write amplification and wear out the disk sooner. The solution that many TSDBs use, including Prometheus and Influx, involves in-memory batching backed by a write-ahead log (WAL) file. If the in-memory batch or the WAL is lost, you would lose data as well.
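
To make the batching-plus-WAL idea concrete, here is a minimal sketch in Python. It is not how Prometheus or Influx actually implement it; the class name `BatchingWriter`, the file names, the 16-byte record layout and the `naive_write_amplification` helper are all made up for illustration. The helper assumes the worst case where every point forces a full 4 KB block write.

```python
import os
import struct
import tempfile

# Hypothetical record layout: int64 timestamp + float64 value = 16 bytes.
RECORD = struct.Struct("<qd")


def naive_write_amplification(point_bytes=8, block_bytes=4096):
    """Worst case: each point forces one full block write (an assumption)."""
    return block_bytes // point_bytes


class BatchingWriter:
    """Illustrative only: buffer points in memory, logging each to a WAL."""

    def __init__(self, directory, batch_size=256):
        self.wal_path = os.path.join(directory, "wal.log")
        self.data_path = os.path.join(directory, "data.bin")
        self.batch_size = batch_size
        self.batch = []
        self._replay()  # recover points logged before a crash
        self.wal = open(self.wal_path, "ab")

    def _replay(self):
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, "rb") as f:
            raw = f.read()
        # Ignore a torn (partially written) trailing record.
        usable = len(raw) - len(raw) % RECORD.size
        self.batch = [RECORD.unpack_from(raw, off)
                      for off in range(0, usable, RECORD.size)]

    def append(self, ts, value, sync=False):
        self.wal.write(RECORD.pack(ts, value))
        if sync:
            # Durable, but one fsync per point is exactly the slow path.
            self.wal.flush()
            os.fsync(self.wal.fileno())
        # With sync=False the point lives only in process memory for now,
        # so a crash at this moment loses it.
        self.batch.append((ts, value))
        if len(self.batch) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.batch:
            return
        # One large sequential write instead of many tiny ones.
        with open(self.data_path, "ab") as f:
            f.write(b"".join(RECORD.pack(*p) for p in self.batch))
            f.flush()
            os.fsync(f.fileno())
        self.batch.clear()
        # The batch is durable, so the WAL can be reset. A crash between the
        # fsync above and this reset would replay duplicates; ignored here.
        self.wal.close()
        self.wal = open(self.wal_path, "wb")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        w = BatchingWriter(d, batch_size=4)
        for i in range(10):
            w.append(i, float(i))
        w.flush()
        w.wal.close()
        print(os.path.getsize(w.data_path), "bytes written in batches")
```

The trade-off shows up in the `sync` flag: syncing every append is safe but slow, while leaving it off is fast but leaves a window where a crash drops whatever has not yet reached the disk.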