by solatic 3176 days ago
> Assuming your database software

You're assuming that people are saving their state in databases to begin with. If you're saving state to a database in production, typically you're communicating with that database over a network connection, and not running the database on the same machine as your application. Containerizing databases is a whole separate issue.

OP's specific example is saving /var/opt/gitlab to an EBS volume and expecting to be able to move it from one spot instance to another without corruption. That strikes me as insane.

2 comments

What is so insane about this? It's no different from plugging in a USB drive, modifying some data on it, then disconnecting. Except in this case, the mount/unmount happens outside of the application's lifecycle, so it can initialize and shut down cleanly without worry.
Why? The gitlab init script to stop it is being run. It's a clean shutdown.
What happens if something causes it to hang? Presumably EC2 will time it out at some point.
And if GitLab (or whichever other application) is hanging and the stop script fails to cleanly shut down the application?

Shit happens at scale, which is precisely why ACID guarantees are important. Specifically in GitLab's case, because configuration is stored under /etc/gitlab, relying on EBS snapshots as a safeguard against corruption only works if the snapshot is taken of the entire FS, not just /var/opt/gitlab.

If your machine is properly provisioned from an AMI, or at least from some kind of configuration management, and you have a reasonably enforced policy that only permits changes through those management systems, then maybe you can get away with snapshotting only /var/opt/gitlab. But now we're getting into the territory of "I understand how my data is being stored to the EBS volume (in this case, according to documented GitLab instructions) and I am acting accordingly". Then, if a /var/opt/gitlab snapshot ends up being corrupted, the more snapshots you try, the better your odds of finding an uncorrupted one. That is probably good enough in this specific instance, because if you needed a better guarantee than that, you'd have a proper HA setup.
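
To make the "keep trying snapshots until one is uncorrupted" fallback concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the snapshot IDs are made up, and `is_intact` is a hypothetical placeholder for whatever check you actually trust (for example, restoring to a scratch volume and running a filesystem check plus an application-level sanity check). It does not call any AWS or GitLab API.

```python
from typing import Callable, Iterable, Optional


def first_good_snapshot(
    snapshot_ids: Iterable[str],
    is_intact: Callable[[str], bool],
) -> Optional[str]:
    """Return the first snapshot that passes the integrity check, or None.

    snapshot_ids is expected to be ordered newest-first, so the result is
    the most recent snapshot that is still usable.
    """
    for snapshot_id in snapshot_ids:
        if is_intact(snapshot_id):
            return snapshot_id
    return None


# Hypothetical data: three snapshots, newest first, where the newest one
# was taken mid-write and is corrupted.
snapshots = ["snap-newest", "snap-middle", "snap-oldest"]
corrupted = {"snap-newest"}

chosen = first_good_snapshot(snapshots, lambda s: s not in corrupted)
none_left = first_good_snapshot(snapshots, lambda s: False)
```

The `None` case is the one that matters: if every snapshot fails the check, there is nothing to fall back to, which is where a proper HA setup would be needed instead.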