by kgeist
1184 days ago
There are several reliability issues:
1) a single panic/exception/segfault in the executable brings down the whole website, so it will be unavailable until the executable restarts
2) entropy *always* increases (RAM usage, memory corruption, hardware issues, OS misconfiguration etc.) so eventually the application will break and stop serving traffic until it's repaired/restarted (which can take time if it's a hardware issue)
3) deployments are tricky if there's nothing before the executable (stop, update, restart => downtime)
4) if the cache is in-process, on a restart it will have to be repopulated from scratch, leading to temporary slowdowns (and maybe a thundering herd problem) which will happen *every time* you deploy an update
I think much of it is ignorable if the site is just a personal blog or a static site. But if the site is a real-time "web application" which people rely on for work, then you still need:
1) some kind of containerization, to deal with inevitable entropy (when a container is restarted, everything is back to the initial clean state)
2) at least two instances of the application: one instance crashes => the second one picks up traffic; or during rolling updates: while one instance is being killed and replaced with a new version, traffic is routed to another instance
3) persistent data (and sometimes caches) need to be replicated (and backed up) -- we've had many hardware issues corrupting DBs
4) automatic failover to a different machine in case the machine is dead beyond repair
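To make point 2 concrete, here is a minimal sketch of what "the second one picks up traffic" means at the routing level. It is a toy in-memory model, not how any particular load balancer is implemented; `Instance`, `RoundRobin` and the `healthy` flag are hypothetical names, and a real setup would set that flag from health checks.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool = True  # assumed to be updated by some health check


class RoundRobin:
    """Rotate over instances, skipping any that are marked unhealthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._next = 0  # index where the next search starts

    def pick(self):
        n = len(self.instances)
        for offset in range(n):
            candidate = self.instances[(self._next + offset) % n]
            if candidate.healthy:
                # Resume after the chosen instance on the next call.
                self._next = (self._next + offset + 1) % n
                return candidate
        raise RuntimeError("no healthy instance available")
```

During a rolling update you would mark one instance unhealthy, replace it, mark it healthy again, and repeat for the other; `pick()` keeps returning whichever instance is still up.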
>not some external monster tool like k8s
What can you use instead of k8s for this kind of scenario? (an ultra reliable setup which doesn't need a whole cluster)
|
You don't need an "ultra reliable setup" or even a "cluster". You can have one nginx as a load balancer pointing at your unicorn/gunicorn/go thing; it's very unlikely to ever go down. You can run a cronjob with pg_dump and rsync. On the off chance your server dies irrecoverably, corrupting the DB (which is really unlikely for Postgres), chances are your business will survive with a fifteen-minute-old database.
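For illustration, the cron job could call something like the sketch below. It only assumes that `pg_dump` and `rsync` are installed and on the PATH; the database name, backup directory and remote target are made-up placeholders, and `backup_commands` / `run_backup` are hypothetical helper names.

```python
import subprocess
from datetime import datetime, timezone


def backup_commands(db_name, backup_dir, remote, now=None):
    """Build the pg_dump and rsync command lines without running them."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    backup_dir = backup_dir.rstrip("/")
    dump_path = f"{backup_dir}/{db_name}-{stamp}.dump"
    # Custom-format dump, restorable with pg_restore.
    dump_cmd = ["pg_dump", "--format=custom", f"--file={dump_path}", db_name]
    # Trailing slash: copy the directory's contents, not the directory itself.
    sync_cmd = ["rsync", "-az", f"{backup_dir}/", remote]
    return dump_cmd, sync_cmd


def run_backup(db_name, backup_dir, remote):
    """Dump the database, then copy the backup directory off the machine."""
    for cmd in backup_commands(db_name, backup_dir, remote):
        subprocess.run(cmd, check=True)  # raise if either step fails
```

A crontab entry with a `*/15 * * * *` schedule invoking a script that calls `run_backup("app", "/var/backups/pg", "backup@example.org:pg/")` would give roughly the fifteen-minute window mentioned above. Old dumps are never pruned here, so you would want some retention on top.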
Most "realtime web applications" are not aerospace, even though we like to pretend that's what we work on. It's an interesting confluence of engineering hubris and managerial FOMO that got us here.