| HN Mirror



	by denysvitali 125 days ago
	Yes, wouldn't their fix likely make etcd inconsistent, since there's no longer any guarantee that the data was persisted to disk?

2 comments

weiliddat 125 days ago

Yes, but they wrote that it’s for a demo, and that it’s fine if they lose the last few seconds of data in the event of an unexpected system shutdown.

Also, in prod, etcd recommends running on SSDs to minimize the variance of fsync/write latencies.


ahoka 125 days ago

Getting into an inconsistent state does not just mean “losing a few seconds”.


weiliddat 125 days ago

How would you get into an inconsistent state based on an fsync change?

Edit: I meant, what sequence of events would cause etcd to go into an inconsistent state when fsync is working this way?


_ananos_ 124 days ago

Data corruption, since fsync on the host is essentially a no-op. The VM's filesystem thinks the data is persisted on disk, but it isn't, and the pod running on the VM thinks the same …
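To make that sequence of events concrete, here is a toy sketch. It is not etcd code and touches no real disk: `SimulatedDisk` and `run` are made-up names, and the model assumes a store that acknowledges an entry as committed as soon as fsync returns. With a working fsync nothing acknowledged is lost on power failure; with a no-op fsync every acknowledged entry can vanish.

```python
class SimulatedDisk:
    """Toy disk (hypothetical): writes sit in a volatile cache until fsync."""

    def __init__(self, fsync_is_noop: bool):
        self.fsync_is_noop = fsync_is_noop
        self.cache = []    # lost on power failure
        self.durable = []  # survives power failure

    def write(self, entry: str) -> None:
        self.cache.append(entry)

    def fsync(self) -> None:
        if self.fsync_is_noop:
            return  # reports success without flushing anything
        self.durable.extend(self.cache)
        self.cache.clear()

    def power_loss(self) -> None:
        self.cache.clear()


def run(fsync_is_noop: bool) -> list:
    """Return the entries that were acknowledged but did not survive a crash."""
    disk = SimulatedDisk(fsync_is_noop)
    acknowledged = []
    for entry in ["put a=1", "put b=2"]:
        disk.write(entry)
        disk.fsync()
        # Assumption: the store treats a returned fsync as "safe to acknowledge".
        acknowledged.append(entry)
    disk.power_loss()
    return [e for e in acknowledged if e not in disk.durable]


if __name__ == "__main__":
    print("working fsync, lost after crash:", run(fsync_is_noop=False))
    print("no-op fsync, lost after crash:  ", run(fsync_is_noop=True))
```

The point of the sketch is that the loss is silent: the layer above was told the write was safe, so it is not just "the last few seconds" missing but state that others may already have acted on.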


justincormack 125 days ago

Yes, they totally missed the point of the fsync...


_ananos_ 125 days ago

Well, the actual issue (IMHO) is that this meta-orchestrator (karmada) needs quorum even for a single-node cluster.

The purpose of the demo wasn't to show consistency, but to describe the policy-driven decision/mechanism.

What hit us in the first place (and I think this is what we should fix) is the fact that a brand new NUC-like machine, with a relatively new software stack for spawning VMs (incus / ZFS etc.), behaves so badly that it can produce such hiccups in disk I/O...
