by denysvitali 125 days ago
Yes, wouldn't their fix likely make etcd not consistent anymore since there's no guarantee that the data was persisted on disk?

Yes, but they wrote that it’s for a demo, and that it’s fine if they lose the last few seconds of data in the event of an unexpected system shutdown.

Also, in prod, etcd recommends running on SSDs to minimize the variance of fsync/write latencies.

Getting into an inconsistent state does not just mean “losing a few seconds”.
How would you get into an inconsistent state based on an fsync change?

Edit: I meant, what sequence of events would cause etcd to go into an inconsistent state when fsync is working this way?

Data corruption, since fsync on the host is essentially a no-op. The VM fs thinks the data is persisted on disk, but it’s not - the pod running on the VM thinks the same …
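
To make that sequence of events concrete, here is a toy Python simulation. It is not etcd code: `SimulatedDisk` and `run_scenario` are hypothetical names, and the "disk" is just two lists standing in for a volatile cache and stable storage. It only illustrates how a store that acknowledges writes after fsync returns can end up having promised data it no longer holds.

```python
class SimulatedDisk:
    """Toy disk: writes sit in a volatile cache until a flush reaches stable storage."""

    def __init__(self, fsync_is_noop):
        self.fsync_is_noop = fsync_is_noop
        self.cache = []   # lost on power failure
        self.stable = []  # survives power failure

    def write(self, entry):
        self.cache.append(entry)

    def fsync(self):
        # With a no-op fsync the call still returns normally, but nothing is flushed.
        if not self.fsync_is_noop:
            self.stable.extend(self.cache)
            self.cache.clear()

    def power_loss(self):
        # Everything that never reached stable storage is gone.
        self.cache.clear()


def run_scenario(fsync_is_noop):
    """Return the entries that were acknowledged but did not survive a power loss."""
    disk = SimulatedDisk(fsync_is_noop)
    acknowledged = []
    for entry in ["put a=1", "put b=2"]:
        disk.write(entry)
        disk.fsync()
        # The store trusts fsync and tells the client the write is durable.
        acknowledged.append(entry)
    disk.power_loss()
    return [e for e in acknowledged if e not in disk.stable]


if __name__ == "__main__":
    print("working fsync, lost:", run_scenario(fsync_is_noop=False))
    print("no-op fsync, lost:  ", run_scenario(fsync_is_noop=True))
```

With a working fsync nothing acknowledged is lost; with the no-op variant every acknowledged entry disappears. In a consensus-based store, a member that forgets entries it already acknowledged is the kind of event that goes beyond "losing a few seconds", although how a real etcd deployment reacts depends on details this sketch does not model.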
Yes, they totally missed the point of the fsync...
Well, the actual issue (IMHO) is that this meta-orchestrator (Karmada) needs quorum even for a single-node cluster.

The purpose of the demo wasn't to show consistency, but to describe the policy-driven decision/mechanism.

What hit us in the first place (and I think this is what we should fix) is the fact that a brand new NUC-like machine, with a relatively new software stack for spawning VMs (Incus / ZFS etc.), behaves so badly that it can produce such hiccups in disk I/O access...
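
One way to put numbers on hiccups like these is to time fsync calls directly on the affected storage. The following is a minimal Python sketch, not a tool from the etcd project: `measure_fsync_latencies` and `summarize` are hypothetical helper names, and the sample count and record size are arbitrary assumptions.

```python
import os
import statistics
import tempfile
import time


def measure_fsync_latencies(samples=50, record_size=4096, directory=None):
    """Return per-call fsync latencies (seconds) for small appended records.

    Pass `directory` to test a specific filesystem; the default temp dir
    may be a RAM-backed tmpfs, which would make the numbers meaningless.
    """
    latencies = []
    payload = b"\0" * record_size
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(samples):
            f.write(payload)
            f.flush()  # push Python's userspace buffer to the OS
            start = time.perf_counter()
            os.fsync(f.fileno())  # ask the OS to flush the file to the device
            latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return median, approximate 99th percentile, and max of the samples."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "median": statistics.median(ordered),
        "p99": ordered[p99_index],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    stats = summarize(measure_fsync_latencies())
    for name, seconds in stats.items():
        print(f"{name}: {seconds * 1000:.3f} ms")
```

Running this inside the VM and on the host should show whether the spikes come from the virtualization layer or the underlying disk. Note that it only measures latency: if fsync is effectively a no-op somewhere in the stack, the numbers will look excellent while durability is gone.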