by HALtheWise 1871 days ago
The article ends with this nugget about C++ shared memory state management, but I don't have the background to know exactly what they're referring to. Does anyone recognize this pattern and feel able to explain it to mere mortals?

> “We also have different tools so that any state that is persisted through the application is managed in a very particular place in memory. This lets us know it is being properly shared between the computers. What you don’t want is a situation where one of the computers takes a radiation hit, a bit flips, and it’s not in a shared memory with the other computers, and it can kind of run off on its own.”

2 comments

Essentially:

They have one address range for shared (i.e. subject to syncing across all replicas) memory, and a separate one for non-shared (single-replica) memory.

Cross-replica data is presumably subject to their agreement algorithm, checking that the different computers reach the same (within some error bars) results; you want to arrange things so that there are frequent checkpoints at which the conflict resolution system can say "a bad write happened at this point, I should disregard whatever this computer said from that point until it recovers".
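
To make the "agreement within error bars" idea concrete, here is a minimal sketch of a majority vote over replica outputs. The names `vote` and `outliers` and the tolerance parameter are my own illustrative assumptions, not anything the article describes; the real agreement algorithm is presumably more involved.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <optional>

// Hypothetical helper: returns a value that a strict majority of replicas
// agree with (each within `tolerance` of it), or std::nullopt if no such
// majority exists.
template <std::size_t N>
std::optional<double> vote(const std::array<double, N>& results, double tolerance) {
    assert(tolerance >= 0.0);
    for (std::size_t i = 0; i < N; ++i) {
        std::size_t agree = 0;
        for (std::size_t j = 0; j < N; ++j) {
            if (std::fabs(results[i] - results[j]) <= tolerance) ++agree;
        }
        if (2 * agree > N) return results[i];  // strict majority found
    }
    return std::nullopt;
}

// Hypothetical helper: marks the replicas whose result is outside the error
// bars of the accepted value, i.e. the ones to disregard until they recover.
template <std::size_t N>
std::array<bool, N> outliers(const std::array<double, N>& results,
                             double accepted, double tolerance) {
    std::array<bool, N> bad{};
    for (std::size_t i = 0; i < N; ++i) {
        bad[i] = std::fabs(results[i] - accepted) > tolerance;
    }
    return bad;
}
```

With three replicas reporting 1.00, 1.01 and 7.5 and a tolerance of 0.05, `vote` accepts 1.00 and `outliers` flags only the third replica.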

i.e. you want local memory to use as scratch space for performance reasons, but to make sure that there isn't a long runway for errors to happen and decisions to be made before the shared-memory checker notices a mistake. To ensure this happens, you want manual control over which memory allocator handles which data.
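
A rough sketch of what "manual control over which allocator handles which data" could look like in C++17, using `std::pmr`. The two static regions, `shared_arena()`, `scratch_arena()`, `in_shared()` and `ControlState` are all hypothetical names I made up; in a real system the shared region would be a fixed mapping that the replication machinery knows about, not a plain global array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <vector>

// Hypothetical stand-ins for the two address ranges.
alignas(std::max_align_t) std::byte shared_region[4096];   // synced across replicas
alignas(std::max_align_t) std::byte scratch_region[4096];  // single-replica only

// Each arena hands out memory from its own region only. The null upstream
// resource makes an overflow throw std::bad_alloc instead of silently
// falling back to the general heap.
std::pmr::memory_resource* shared_arena() {
    static std::pmr::monotonic_buffer_resource arena(
        shared_region, sizeof shared_region, std::pmr::null_memory_resource());
    return &arena;
}

std::pmr::memory_resource* scratch_arena() {
    static std::pmr::monotonic_buffer_resource arena(
        scratch_region, sizeof scratch_region, std::pmr::null_memory_resource());
    return &arena;
}

// True if p points into the shared range; usable as a runtime check that
// persisted state really lives where the checker expects it.
bool in_shared(const void* p) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    const auto lo = reinterpret_cast<std::uintptr_t>(shared_region);
    return addr >= lo && addr < lo + sizeof shared_region;
}

// The choice of allocator is explicit per member. Note that only the element
// storage lands in the chosen arena; the vector objects themselves live
// wherever ControlState is placed.
struct ControlState {
    std::pmr::vector<double> persisted{shared_arena()};
    std::pmr::vector<double> scratch{scratch_arena()};
};
```

This only shows the allocation side. It does not by itself prevent shared data from holding a pointer into scratch memory.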

That makes sense. One part I'm still not clear on is how you accomplish a "restore" to fix the broken state of a process with a bitflip. Is it enough to simply copy all the shared state memory over as a block and jump into executing it? That seems like it would require the invariant that shared memory never references private memory, and I'm not sure how to statically enforce that.

"Restore" is a reboot. The circuits that trigger it are usually called "watchdog circuits", which you may have heard of from more mundane embedded applications.

Once you've rebooted, yeah, you need to copy over the shared state from another of the processes.

You probably have hardware that watches ECC flags. For a correctable one-bit flip, it triggers a read-and-then-write. For a two-bit flip, it might just kill and restart the process, or reset the whole machine. As long as it doesn't happen too often, it's fine: the whole system (constellation and ground nodes) is designed to accommodate such events.

This is very common when controlling equipment in C++.

Technically, all C++ programs control equipment. What sets this kind apart is that the program may run for weeks or even years, and is mostly the only program running, or the only one on its core. This happens from microcontrollers on up to servers with a TB of RAM running, say, high-frequency trading, and in networks of hundreds of those, running weather simulations.

The program typically does all its heap allocation at startup. There is no reference-counting std::shared_ptr. You might have lots of std::vector<std::unique_ptr<T>>, std::string, the works, but they all get provisioned in the first second or two, and then just used thereafter. If anything goes wrong, you don't try to do anything clever or sophisticated; you just kill and restart, or even re-boot, and start over from scratch. That is fine if it doesn't happen too often, so you make sure it doesn't.
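
As an illustration of "provision everything in the first second, then just use it", here is a minimal fixed-size object pool. `Message` and `Pool` are hypothetical names for the sketch; the point is only that the constructor does all the heap allocation and the steady-state calls do none.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical payload type, just for the sketch.
struct Message {
    int id = 0;
    double payload = 0.0;
};

class Pool {
public:
    // All heap allocation happens here, once, at startup.
    explicit Pool(std::size_t n) {
        slots_.reserve(n);
        free_.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            slots_.push_back(std::make_unique<Message>());
            free_.push_back(slots_.back().get());
        }
    }

    // No allocation: pops from the pre-sized free list.
    // Returns nullptr when the pool is exhausted.
    Message* acquire() noexcept {
        if (free_.empty()) return nullptr;
        Message* m = free_.back();
        free_.pop_back();
        return m;
    }

    // No allocation: capacity was reserved up front, so push_back
    // never has to grow the vector.
    void release(Message* m) noexcept {
        assert(free_.size() < free_.capacity());
        free_.push_back(m);
    }

    std::size_t available() const noexcept { return free_.size(); }

private:
    std::vector<std::unique_ptr<Message>> slots_;  // owns the objects
    std::vector<Message*> free_;                   // currently unused objects
};
```

Running out of pool slots is treated as a sizing bug, in the spirit of the comment: you make the pool big enough, rather than adding a clever fallback path.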

For communication between programs, some of the memory set up is shared, with a header containing std::atomic<std::uint64_t> sequence counters that each process can watch and compare against its last copy to know when something changed. Most commonly, actual messages show up on a ring buffer, so you don't need to act on them immediately; as long as you pick them up before they get lapped, you're good. If you get lapped, you might need to reset the whole system; so you make sure not to get lapped, by making the ring buffers big enough and by picking up messages soon enough. With big enough ring buffers and careful scheduling, you can leave all the bulk data there and just use it before it gets overwritten, avoiding expensive copies.
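
A minimal sketch of the sequence-counter-plus-ring-buffer idea, assuming a single writer and fixed-size 64-bit messages. `Ring`, `poll` and `Poll` are hypothetical names; real setups place the structure in shared memory and carry larger payloads. The slots are atomics and everything uses the default sequentially consistent ordering to keep the sketch simple to reason about; weaker orderings are possible but need more care.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

template <std::size_t Capacity>
struct Ring {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
    std::atomic<std::uint64_t> head{0};  // number of messages published so far
    std::array<std::atomic<std::uint64_t>, Capacity> slots{};

    // Single writer only: fill the slot first, then advance the counter
    // so readers never see a sequence number without its message.
    void publish(std::uint64_t msg) {
        const std::uint64_t seq = head.load();
        slots[seq % Capacity].store(msg);
        head.store(seq + 1);
    }
};

enum class Poll { Empty, Ok, Lapped };

// Each reader keeps its own cursor (the sequence number it expects next)
// and compares it against the shared counter.
template <std::size_t Capacity>
Poll poll(const Ring<Capacity>& ring, std::uint64_t& cursor, std::uint64_t& out) {
    if (ring.head.load() == cursor) return Poll::Empty;  // nothing new
    const std::uint64_t value = ring.slots[cursor % Capacity].load();
    // Re-check after the read: once the writer is Capacity or more messages
    // ahead, it may already be overwriting this slot, so the value cannot
    // be trusted. This is conservative by one slot.
    if (ring.head.load() - cursor >= Capacity) return Poll::Lapped;
    out = value;
    ++cursor;
    return Poll::Ok;
}
```

A reader that sees `Poll::Lapped` has lost messages and has to resynchronize somehow, which is the "reset the whole system" case described above.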

Often, once the program starts up, it makes no more system calls at all, doing all its work by reading and writing shared memory, and maybe poking at hardware registers. On Linux one usually isolates the cores doing this with boot parameters such as "isolcpus=...", "nohz_full=...", "rcu_nocbs=...", and "rcu_nocb_poll". The ring buffers tend to live in hugepages ("hugepages=50000"), often just as files opened in /dev/hugepages. This is all a simpler alternative to a unikernel/parakernel/demikernel/blatherkernel.

You might also have ephemeral processes that run just long enough to do a job and then quit, running on their own pool of cores and using their own pool of memory. This is usually how you administer the system: ssh in, look around, exit.