|
> checkpointing the state to a file on the node, and loading that into your newly started pods For some types of my nodes, the majority of the state was tcp connections and associated processes. I don't think there's a generally available system that is capable of transferring tcp connection state (although, I'd love to build one! if you've got a need, funding, and a flexible time table), which would be a prerequisite to moving the process state. All of those connections need to be ended, and clients reconnect to another server (where they might need to do it again, if they don't get lucky and get a new server to begin with). The other nodes with more traditional state had up to half a terrabyte of state in memory, and potentially more on disk, and a good deal of writes. That's seven minutes to transfer state on 10G ethernet, assuming you can use the whole bandwidth and sending and receiving that data is faster than the network. Although, in my experience, we didn't tend to explicitly replicate disk based storage for new nodes, all of our disk based data was transient, so replacing nodes meant writing to new nodes, read from new and old, and retiring the old nodes when their data had all been fetched and deleted, or the data retention cap was missed. I/O Volume meant networked filesystems would be a big stretch. You could probably do something with dual-ported SAS drives, and redundant pairs of machines on the same rack, but then that pair with both go down when that rack has an unforseen problem, plus good luck getting dual-ported SAS drives hooked up properly when you're in someone else's bare metal managed hosting. (Yeah OK, maybe we had big performance requirements, but hotloading works just as well for stuff that fits on a single redundant pair, or even a single server in a pinch) |