by brigadier132 832 days ago
Ok, what risk? CockroachDB is already proven technology and costs marginally more (if you use their serverless setup, it's free until you hit real scale). At the startups I've been at that hit scale, scaling SQL was always a massive undertaking and affected product development every single time.

If you don't want downtime, don't use databases that require downtime to do a migration?

Netflix, Roblox, every single online gambling website all use CockroachDB.


Sounds like their discomfort was in the migration path to 'any other database' alongside not having the experience with another database to mitigate any unknown unknowns.

> During our evaluation, we explored CockroachDB, TiDB, Spanner, and Vitess. However, switching to any of these alternative databases would have required a complex data migration to ensure consistency and reliability across two different database stores.

> ensure consistency and reliability across two different database stores.

This is the main known known, and it is a hard thing to attain.

My favorite story on that is the testing of the Tendermint consensus implementation [1]. The testing found a way to break consensus, and the reason was that the protocol implementation and the KV store controlled by the protocol used different databases.

[1] https://jepsen.io/analyses/tendermint-0-10-2

Never used Cockroach, so pardon my ignorance, but are there no operational challenges with running or using it? Or are they the same challenges? And how compatible is it from an application developer's perspective?
The managed service is hassle-free and it's auto-sharded, so you don't have traditional scaling issues. You do need to think about how your index choices spread writes and reads across the cluster to avoid hotspots. It's almost completely compatible with the Postgres wire protocol, but for the most part it doesn't support things like extensions.
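
To make the hotspot point concrete, here is a toy sketch. It is not CockroachDB's actual range-splitting logic; it only assumes a keyspace cut into contiguous ranges at fixed split points. `BOUNDARIES`, `range_for`, and `hashed_key` are made-up names for illustration. Monotonically increasing keys all land in the last range, while hash-derived keys spread across all of them.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical split points: 4 contiguous ranges over a 16-bit keyspace.
BOUNDARIES = [16384, 32768, 49152]

def range_for(key: int) -> int:
    """Return the index of the contiguous range that owns `key`."""
    return bisect.bisect_right(BOUNDARIES, key)

def hashed_key(seq: int) -> int:
    """Derive a 16-bit key from a sequence number via SHA-256."""
    digest = hashlib.sha256(str(seq).encode()).digest()
    return int.from_bytes(digest[:2], "big")

# 1000 inserts with sequential keys near the top of the keyspace.
sequential = Counter(range_for(60000 + i) for i in range(1000))

# The same 1000 inserts, keyed by a hash of the sequence number.
hashed = Counter(range_for(hashed_key(i)) for i in range(1000))

print("sequential:", dict(sequential))
print("hashed:    ", dict(hashed))
```

The tradeoff in this sketch is the usual one: hashing spreads writes but gives up ordered scans on the original sequence.
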
There are TONS of operational issues running Cockroach. At the last company I was at, Cockroach was probably overused as a magical way to run multiple DCs and keep things consistent without high developer overhead, but it was the #1 source of large outages. So much so that we’d run a Cockroach cluster segmented out for a single microservice to limit the blast radius when it eventually failed.

That, and it's comically more expensive than Postgres. If you think IOPS are expensive, wait till you see the service contract.

CRDB is Postgres-compliant, so the wire protocol and SQL syntax are all Postgres. It should be 1 to 1.
Are all the corresponding latencies for every query one to one too?
In ye olden times I used to stop bosses from throwing away the slowest machine we had, and try to get at least one faster machine.

It’s still somewhat the case, but at the time the world was rotten with concurrent code that only worked because an implicit invariant (almost) always held. One that was enforced by the relative time or latency involved with two competing tasks. Get new motherboards or storage or memory and that invariant goes from failing only when the exact right packet loss happens, to failing every day, or hour, or minute.

Yes, it’s a bug, but it wasn’t on your radar and the system was trucking along yesterday and now everything is on fire.

The people who know this think the parent is a very interesting question. The people who don’t, tend to think it’s a non sequitur.
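
A minimal sketch of that kind of bug, as a deterministic simulation rather than real threads. The function `run` and the latency numbers are invented for illustration. Two unsynchronized tasks start at t=0, and the reader only sees the value if the writer happens to finish first.

```python
def run(writer_latency_ms: int, reader_latency_ms: int):
    """Simulate two unsynchronized tasks started at t=0.

    Returns what the reader observed. Nothing enforces the ordering;
    it falls out of the relative latencies alone.
    """
    shared = {"value": None}
    seen = []
    events = [
        # (completion time, tie-breaker, action)
        (writer_latency_ms, 0, lambda: shared.update(value="ready")),
        (reader_latency_ms, 1, lambda: seen.append(shared["value"])),
    ]
    # Replay the actions in order of completion time.
    for _, _, action in sorted(events, key=lambda e: (e[0], e[1])):
        action()
    return seen[0]

# Old hardware: the reader is slow, so the implicit invariant holds.
old = run(writer_latency_ms=10, reader_latency_ms=50)

# Faster storage on the read path: same code, invariant gone.
new = run(writer_latency_ms=10, reader_latency_ms=5)

print(old, new)
```

Swapping a database changes latencies in the same way a hardware upgrade does, which is why the per-query latency question matters even when the SQL is identical.
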

This is the important question.

We evaluated several horizontally scalable DBs and Cockroach was by far the slowest for our access patterns.

Except for the unimplemented features, which they might need.

It also uses serializable isolation, and in their implementation reads are blocked by writes, unlike in Postgres. Those are both significant changes that can have far-reaching application impacts.
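
One concrete application impact: under serializable isolation a transaction can be aborted with SQLSTATE 40001 (serialization failure), and the application is expected to retry it. Below is a rough sketch of such a retry wrapper. It uses no real driver; `SerializationFailure` is a stand-in class defined here, and `run_with_retry` and its parameters are hypothetical names. A real driver exposes its own error type for this.

```python
import time

class SerializationFailure(Exception):
    """Stand-in for a driver error carrying SQLSTATE 40001 (assumption)."""
    sqlstate = "40001"

def run_with_retry(txn, max_attempts=5, base_delay_s=0.01):
    """Call `txn()` and retry with exponential backoff on serialization failures.

    `txn` must be safe to re-run from the start: any side effects outside
    the database need to be idempotent.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            time.sleep(base_delay_s * 2 ** (attempt - 1))

# Usage: a fake transaction that conflicts twice before committing.
calls = {"n": 0}

def flaky_txn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise SerializationFailure("restart transaction")
    return "committed"

result = run_with_retry(flaky_txn, base_delay_s=0)
print(result, calls["n"])
```

Code written against Postgres's default READ COMMITTED level often has no such loop, which is part of why the switch is not purely a wire-protocol question.
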