| HN Mirror

by orthecreedence 618 days ago

Two main reasons I can see:

Ops is easier, for the most part. Doing ops on an RDBMS correctly can be a pain: replication, failover, performance tuning, and so on can all be hard. That said, this is much less of an issue now, because services like RDS solve it and have for a long time. Not a huge issue there.

Splitting compute from storage makes scaling a lot easier, especially when storage is an object store where you don't have to worry about RAID, disk backups, etc. For clustered systems like Elasticsearch in particular, object-store backing would be incredible: if you need to spin a server up or down, instead of starting it, convincing it to download the portions of the indexes it's supposed to hold, and waiting for everything to transfer, you just start it and let it run immediately. You can also run 80% spot instances for your compute nodes, because if one gets recalled, the replacement doesn't have to sync all its state from the other servers; it can just go to business as usual. And a sudden loss of 60% of your nodes doesn't mean data loss like it does when your nodes are holding all the state.
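
To make the node-loss point concrete, here's a toy sketch in Python. Everything in it is made up for illustration: `ObjectStore`, `StatefulNode`, the segment names, and the replication layout (each stateful node holds one segment plus a replica of its neighbor's) are assumptions, not how Elasticsearch or any real system lays out data. It only shows that when nodes hold the state, losing enough of them loses segments, while with a shared store the surviving data doesn't depend on which nodes are alive.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """In-memory stand-in for a durable shared object store (hypothetical)."""
    segments: dict = field(default_factory=dict)


@dataclass
class StatefulNode:
    """Compute node that keeps its share of the data on local disk."""
    name: str
    local: dict = field(default_factory=dict)


def build_stateful_cluster(n_nodes: int) -> list:
    """Node i holds segment i plus a replica of segment (i + 1) % n_nodes."""
    nodes = []
    for i in range(n_nodes):
        own = f"seg{i}"
        replica = f"seg{(i + 1) % n_nodes}"
        nodes.append(StatefulNode(f"n{i}", {own: b"data", replica: b"data"}))
    return nodes


def surviving_segments_stateful(nodes: list, lost: set) -> set:
    """Segments still readable after the nodes named in `lost` disappear."""
    alive = set()
    for node in nodes:
        if node.name not in lost:
            alive.update(node.local)
    return alive


def surviving_segments_shared(store: ObjectStore, lost: set) -> set:
    """With a shared store, losing compute nodes does not remove any data."""
    return set(store.segments)


if __name__ == "__main__":
    all_segments = {f"seg{i}" for i in range(5)}
    lost_nodes = {"n0", "n1", "n2"}  # 3 of 5 nodes, i.e. 60%

    stateful = build_stateful_cluster(5)
    kept = surviving_segments_stateful(stateful, lost_nodes)
    print("stateful, segments lost:", sorted(all_segments - kept))

    store = ObjectStore({name: b"data" for name in all_segments})
    kept = surviving_segments_shared(store, lost_nodes)
    print("shared store, segments lost:", sorted(all_segments - kept))
```

In the stateful case the replacement nodes would also have to copy segments back from peers before serving; in the shared-store case they just point at the store and go.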

I think for something like an RDBMS, object-store backing is very likely overkill, unless you're hitting a scaling threshold that most of us never deal with. For clustered DB systems (Cassandra/Scylla, ES, etc.), splitting out storage makes cluster management, scalability, and resiliency worlds easier.