Hacker News
by jjirsa 3361 days ago
A future clustered version of TimescaleDB will reinvent solutions to problems already solved by Cassandra, Riak, and similar databases, and the effort of doing so will be far greater than the effort needed to add data exploration and complex queries on top of Cassandra in the first place. That is: clustering is the hard part, and had this team spent its time bolting the analytics logic onto Cassandra, similar to what FiloDB did, they'd have the best of both worlds already.

Wraparound vacuuming still sucks for high write workloads. I've been there. I've fought that problem in a high-write-throughput-no-delete-immutable-workload. I've seen it in person. You're still writing a lot of frozen txids to disk. Your slaves are still going to get the WAL command in a single-threaded WAL sender and fall behind in replication as that vacuum runs. You're still going to have pain trying to create an HA setup.

1 comment

Vacuuming has traditionally been a problem with large table sizes. In TimescaleDB we break up the tables so that they are smaller. That, combined with the new freeze map feature in Postgres (since PG 9.6: http://rhaas.blogspot.jp/2016/03/no-more-full-table-vacuums....), makes vacuums a non-issue for us. Certainly we've never seen this issue (even on default autovacuum settings), and we've tested some huge datasets.
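
To make the argument above concrete, here is a toy model of why an all-frozen bit per page helps an append-only, time-chunked workload. This is only a sketch, not Postgres or TimescaleDB internals: `Chunk`, `aggressive_vacuum`, and `simulate` are made-up names, and the page counts are arbitrary. It assumes old chunks are never written again, so once their pages are frozen, a later anti-wraparound pass can skip them.

```python
class Chunk:
    """Toy stand-in for one time-partitioned child table (hypothetical)."""

    def __init__(self, n_pages):
        # One "all-frozen" flag per page, loosely mimicking the freeze map.
        self.all_frozen = [False] * n_pages


def aggressive_vacuum(chunk, use_freeze_map):
    """Freeze every page of a chunk and return how many pages were scanned."""
    scanned = 0
    for page, frozen in enumerate(chunk.all_frozen):
        if use_freeze_map and frozen:
            continue  # page is already known to be frozen, skip it
        scanned += 1
        chunk.all_frozen[page] = True
    return scanned


def simulate(n_chunks, pages_per_chunk, use_freeze_map):
    """Run two vacuum passes; new data arrives only as one fresh chunk."""
    chunks = [Chunk(pages_per_chunk) for _ in range(n_chunks)]
    first = sum(aggressive_vacuum(c, use_freeze_map) for c in chunks)
    # Assumption: the workload is append-only, so old chunks stay untouched
    # and all new writes land in a newly created chunk.
    chunks.append(Chunk(pages_per_chunk))
    second = sum(aggressive_vacuum(c, use_freeze_map) for c in chunks)
    return first, second


with_map = simulate(10, 1000, use_freeze_map=True)
without_map = simulate(10, 1000, use_freeze_map=False)
print("pages scanned with freeze map:   ", with_map)
print("pages scanned without freeze map:", without_map)
```

In this model the second pass scans only the newest chunk when the freeze map is used, and rescans everything when it is not. It says nothing about WAL volume or replication lag, which is the other half of the parent comment's complaint.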