by tyingq 2233 days ago
Looking at the product comparison chart[1], there are quite a lot of features that aren't in the open source "core" CockroachDB. That's fine of course, but the one that seems concerning is backup/restore. Is there a reasonable and reliable way to do backups and restores with just the "core" open source product?

[1] https://www.cockroachlabs.com/compare/

4 comments

No, there isn't.

The only option is the dump command (full backup). But it's slow. And the restore is unreasonably slow. If you have a medium-sized database, you'll likely have to accept an uncomfortably large (possibly even disastrous) data-loss window and a long downtime while you restore.
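To make the data-loss window and downtime concrete, here is a rough back-of-the-envelope sketch. Every number and function name in it is hypothetical, not a measurement of CockroachDB. It assumes a dump reflects the database as of the moment the dump started, that dumps start on a fixed schedule, and that dump and restore run at a constant throughput.

```python
import math


def transfer_hours(size_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move size_gb at a constant throughput (1 GB = 1000 MB)."""
    return size_gb * 1000 / throughput_mb_per_s / 3600


def worst_case_data_loss_hours(interval_h: float, dump_h: float) -> float:
    """Worst case: the cluster dies just before the newest dump finishes.

    The last usable dump then started one full interval before the
    unfinished one, so everything written since that start is lost.
    """
    return interval_h + dump_h


# Hypothetical inputs, chosen only to illustrate the arithmetic.
size_gb = 360
dump_h = transfer_hours(size_gb, throughput_mb_per_s=50)     # assumed dump speed
restore_h = transfer_hours(size_gb, throughput_mb_per_s=10)  # assumed restore speed

print(f"dump takes     {dump_h:.1f} h")
print(f"restore takes  {restore_h:.1f} h of downtime")
print(f"worst-case data loss with daily dumps: "
      f"{worst_case_data_loss_hours(24, dump_h):.1f} h")
```

The point is only that with full dumps the loss window is the schedule interval plus the dump time, and the downtime is bounded below by the restore time, so both grow with database size.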

There are even cases where it can't work (1).

The free options from PG (namely things like barman and the built-in replication) are far superior to what the paid version of CockroachDB offers, let alone the community edition.

I wish they'd let you take a rocksdb checkpoint of a single node and restore that into another cluster. This should work if your replication factor == node count (a common setup). Getting access to the checkpoint isn't complicated, but recovering this with the leaseholders and cluster config baked into the database requires more insight into their abstractions than I have. Feels like something they need anyways, because, as-is, a permanent loss of 2 nodes is impossible to recover from (2).

(1) https://github.com/cockroachdb/cockroach/issues/28948

(2) https://github.com/cockroachdb/cockroach/issues/17186

Yeah, backup is missing. This question has been asked in the comments of every HN post about a new release.

It doesn’t make sense. They are missing out on lots of enthusiast/hacker adoption from people who can’t afford enterprise anyway.

Having backup as a “differentiating feature” for enterprise is such a stupid idea.

Well, and it eventually sets you up for some bad press. I assume an "I chose CRDB, and now I've lost all my data" story will hit at some point. And it won't be obvious to all the readers that they weren't a paying customer.

I may be misremembering but I'm pretty sure you can back it up like any Postgres database, e.g. with pg_dump

I think the premium distributed backup/restore thing is for backing up separate regional clusters individually

edit: looks like the premium option is a nice "BACKUP" command that handles uploading or downloading from cloud storage (e.g. s3) automatically. but for free you get "cockroach dump" which is similar to pg_dump
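For anyone who wants to script the free path, here is a minimal sketch of wrapping "cockroach dump" and replaying the result through "cockroach sql". It only builds the command lines and does not run them. The helper names, the "bank" database, the host and the file name are all placeholders I made up, and the flags shown (--host, --insecure, --database) should be checked against the CockroachDB version you actually run.

```python
import shlex


def dump_command(database: str, host: str = "localhost:26257",
                 insecure: bool = True) -> list[str]:
    """Build argv for a full SQL dump of one database (hypothetical helper)."""
    argv = ["cockroach", "dump", database, f"--host={host}"]
    if insecure:
        # Only appropriate for a local test cluster without TLS.
        argv.append("--insecure")
    return argv


def restore_command(database: str, host: str = "localhost:26257",
                    insecure: bool = True) -> list[str]:
    """Build argv for replaying a dump file via the SQL shell (hypothetical helper)."""
    argv = ["cockroach", "sql", f"--database={database}", f"--host={host}"]
    if insecure:
        argv.append("--insecure")
    return argv


# Print the shell lines instead of executing them; the dump goes to stdout,
# so it is redirected into a file, and the restore reads that file on stdin.
print(shlex.join(dump_command("bank")) + " > bank-backup.sql")
print(shlex.join(restore_command("bank")) + " < bank-backup.sql")
```

This is a plain full dump and replay, so the speed complaints elsewhere in the thread still apply.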

The enterprise license gets you incremental backups.

cockroach dump is similar to pg_dump, but it's worth pointing out that pg offers a lot more than just pg_dump. Things like pg_basebackup (and accompanying tools) and various replication strategies and capabilities (e.g. recovery_min_apply_delay) make pg a vastly safer option from a DR point of view.

Is that still open source? Didn't they change to the BSL?