|
|
|
|
|
by kbenson
2792 days ago
|
|
> Also, having "clock slew" be a matter of perspective—with processes that can handle leap seconds seeing them happen instantaneously; and processes that can't handle leap-seconds, seeing slewed time—would be nice. I imagine there might be some really interesting (for meanings of interesting that include shoot me now) and hard to track down bugs as you deal with inconsistent clocks not just across systems within a network, but processes within a single system. |
|
I feel like the "safe assumption" that the other end of a given IPC channel (or even inter-thread communication channel) is on the same machine, is responsible for the vast majority of failures we see in e.g. Jepsen testing of databases.
After all, in sufficiently-large computers (i.e. HPC clusters that pretend to be one "computer"), you've got NUMA zones that are light-microseconds away from one another, where even threads of the same process can literally end up needing vector clocks to linearize events between themselves.
It probably wouldn't be too bad a thing if things like the Linux base-system used only internal IPC mechanisms that exposed this unreliability (like e.g. Erlang does with "unreliable async message passing" as its IPC primitive), forcing each component to deal with the fact that its peers may or may not be netsplit away from it.
Even if that scenario will only come up if you're writing code to get your GPS position from a Dyson sphere of 10-mile-deep Matryoska brains.