| HN Mirror



	by saxenaabhi 231 days ago
	But ephemeral and non-redundant. Am I correct in that using local disk on any VPS has durability concerns?

9 comments

sgarland 231 days ago

Yes, it’s the ephemerality that’s the biggest issue. Enterprise-grade SSDs are quite reliable, and typically have PLP (power-loss protection), so even in the event of a sudden power loss, any queued writes that the drive has accepted - and thus ack’d the fsync() - will be written. Presumably you’d be running some kind of redundancy, likely some flavor of RAID or RAID-Z (assuming purely local storage here, not a distributed system like Ceph, nor synchronous replication).
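To make the fsync() acknowledgement concrete, here is a minimal sketch of a write that returns only after the OS reports the data flushed. The helper name `durable_write` is hypothetical, and the directory fsync assumes a POSIX system such as Linux. Whether the bytes then survive a power cut still depends on the drive honoring the flush, which is where PLP comes in.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync() has succeeded.

    Hypothetical helper for illustration; assumes a POSIX filesystem.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to flush file data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the parent directory so the new directory entry is flushed.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A database does the equivalent for its write-ahead log before acknowledging a commit. None of this helps if the instance and its local disk are gone, which is the ephemerality problem discussed in this thread.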

But in the cloud, if the physical server backing your instance dies, or even if someone accidentally issues a shutdown command, you don’t get that same drive back when the new instance comes up. So a problem that is normally solved by basic local redundancy suddenly becomes impossible to solve that way, and thus you must either run synchronous or semi-sync replication (the latter is what PlanetScale Metal does), accepting the latency hit from distributed storage, or run asynchronous replication and accept some amount of data loss, which is rarely acceptable.
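The trade-off between these replication modes can be sketched with a toy model. It is not how MySQL, Postgres, or PlanetScale implement replication. The `Cluster` class, the two-replica layout, and the `REQUIRED_ACKS` table are all assumptions for illustration. The only thing modeled is how many replicas must have a write before the client gets its acknowledgement, and what survives if the primary and its local disk vanish right afterwards.

```python
from dataclasses import dataclass, field

# Hypothetical policy table for a primary with two replicas:
# how many replicas must hold a write before the client is acked.
REQUIRED_ACKS = {"async": 0, "semisync": 1, "sync": 2}


@dataclass
class Cluster:
    mode: str
    primary_log: list = field(default_factory=list)
    replica_logs: list = field(default_factory=lambda: [[], []])

    def commit(self, value) -> None:
        """Apply a write; returning from this method is the client ack."""
        self.primary_log.append(value)
        # Only the replicas we wait for have the write at ack time.
        # In this toy model the others lag and never catch up.
        for log in self.replica_logs[: REQUIRED_ACKS[self.mode]]:
            log.append(value)

    def lose_primary(self) -> list:
        """Primary and its ephemeral disk are gone; promote the freshest replica."""
        return max(self.replica_logs, key=len)


def surviving_writes(mode: str, writes: list) -> list:
    cluster = Cluster(mode)
    for w in writes:
        cluster.commit(w)
    return cluster.lose_primary()
```

In this model, `surviving_writes("async", ["a", "b"])` returns an empty list (every acknowledged write is lost), while the `"semisync"` and `"sync"` modes both return `["a", "b"]`. The price is that each commit waits on at least one network round trip.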

samlambert 231 days ago

Agreed on these trade-offs. We do both synchronous and semi-synchronous depending on Postgres or MySQL.

pas 230 days ago

... sounds like a trivial job for bare metal instances

And the fact that EC2 local NVMe encryption keys are ephemeral is nice against leaks, but it is not a necessity for other clouds (and not great for resumability, which can really downgrade business continuity scores). For all the money they ask for it, I expect them to be able to keep it relatively secure even across reboots.

BonoboIO 230 days ago

Or even a simple bare-metal server that just does databases, with redundant NVMe SSDs.

fabian2k 231 days ago

Databases like Postgres have well established ways to handle that. And if you're setting up the DB yourself, you absolutely need to do backups anyway. And a replica on a different server.

saxenaabhi 229 days ago

Backups don't alleviate durability concerns. Neither do (async) read replicas.

I think the only way it could work is if I implemented sync replication like PlanetScale, but that's arduous.

XCSme 231 days ago

On some providers (e.g. Hetzner), the dedicated servers come by default with two disks in RAID 1, so the storage is a lot less likely to fail (unless the datacenter burns down).

whizzter 231 days ago

You have a call from France, some company called OVH on the line!

BonoboIO 230 days ago

And your backup goes up in flames too.

I would never ever trust OVH with any important data or servers. We saw how they secured their datacenters: it took 3 hours to cut the power while the datacenter was burning.

rcrowley 231 days ago

Yes, a single disk in a VPS or cloud provider has durability concerns. That's why EBS and products like it that pretend to be a single disk are actually several. Instead of relying on multiple block devices, though, we create that redundancy at a higher level by relying on multiple MySQL or Postgres servers for durability, each with a local NVMe drive for performance.
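As a rough illustration of why several independent copies help, consider the chance of losing every copy at once. The sketch assumes node failures are independent and uses a hypothetical per-node failure probability. Correlated failures (a shared rack, power feed, or datacenter) break that assumption, so treat the result as an optimistic lower bound on risk. The function name `all_copies_lost` is made up for this example.

```python
def all_copies_lost(p_node: float, copies: int) -> float:
    """Probability that every copy is lost in the same window.

    Assumes each node fails independently with probability p_node
    (a hypothetical figure, not a measured one).
    """
    if not 0.0 <= p_node <= 1.0:
        raise ValueError("p_node must be a probability")
    if copies < 1:
        raise ValueError("need at least one copy")
    return p_node ** copies
```

With a made-up `p_node` of 0.01, one copy is lost with probability 0.01, while three independent copies are all lost with probability on the order of one in a million.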

inapis 231 days ago

Sure. To an extent. And if you run a mission-critical application, definitely.

But most applications run fine from local storage and can tolerate some downtime. They might even benefit from the improved performance. You can also fix the durability and disaster recovery concerns by setting up on RAID/ZFS and maintaining proper backups.

jascha_eng 231 days ago

Yeah, PlanetScale loves to flex how fast they are, but the main reason they are fast is that they run one full abstraction layer fewer than any other cloud provider, and this does in fact have trade-offs.

samlambert 231 days ago

What is wrong with running without lots of abstractions? We are clear about the downsides. The results are clear, you can see the customers love it. We run insane amounts of state safely on ephemeral compute. It's a flex. All I've seen from Timescale people is qqing. Write some code or be quiet.

jascha_eng 231 days ago

I'm not criticizing your engineering approach at all. Running everything in one box has its merits, as your benchmarks show, but it is also just not apples to apples: there are other trade-offs, and I am just appreciating that the community calls that out.

Also, hey, this is HN, not Twitter; I think we can be a bit more civilized. Not a good look, imo, for a CEO to get that upset over a harmless comment.

samlambert 230 days ago

We run 3 nodes, not 1. Your comment is not in isolation; we get constant shade from Timescale people when we don't even think about you.

CodesInChaos 231 days ago

Using a single disk has durability concerns. But I don't see why VPS vs dedicated server should matter much.

rcrowley 231 days ago

RAID isn't the answer, either, for the record. In AWS and GCP, the CPU or RAM blowing up will cost you access to that local NVMe drive, too, no matter how much RAID you throw at it.

samlambert 231 days ago

We have mitigated the durability concerns in multiple ways.