by scottmas 1075 days ago
No one is talking about redundancy though. I love setups like this but prod environments need robust forms of redundancy. Cloud Run, k8s, and their ilk are extremely distasteful, I’ll grant you (the added complexity and cost are almost never worth it, and don’t get me started on the painfully slow prod debug cycles…), but their redundancy and uptime just can’t be beat by a setup like this.

Also, none of the solutions discussed here gracefully handle new connections on the new service while waiting for all connections to terminate before shutting down the old service. Maybe some of the more esoteric Ansible setups do, idk.

I TRULY want the simplicity of setups like those discussed here, but I can’t help but think it’s irresponsible to recommend them in non-hobbyist scenarios.

2 comments

You have to decide whether the complexity and cost of a fully redundant system are worth it, and weigh that against your SLA, especially if the redundancy itself increases the risk of something going wrong because of the extra complexity.

From personal experience in B2B web apps, a lot of sales/business MBA types will say they need 100% uptime, but what they actually mean is that it needs to be available whenever their customer's users want to access it. Those users are business users who work 9-5, so there's plenty of scope for the system to be down (either due to a genuine outage or for maintenance/upgrades).
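To put a rough number on that "plenty of scope", here is a back-of-the-envelope sketch. It assumes a single time zone and a Monday-to-Friday, 8-hours-a-day window; the function names are made up for illustration.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def hours_outside_window(days_per_week=5, hours_per_day=8):
    """Weekly hours that fall outside the users' working window."""
    return HOURS_PER_WEEK - days_per_week * hours_per_day


def fraction_outside_window(days_per_week=5, hours_per_day=8):
    """Share of the week in which downtime would go unnoticed by 9-5 users."""
    return hours_outside_window(days_per_week, hours_per_day) / HOURS_PER_WEEK


if __name__ == "__main__":
    print(hours_outside_window())               # 128
    print(round(fraction_outside_window(), 3))  # 0.762
```

Under those assumptions, roughly three quarters of the week is outside working hours. Customers spread across time zones or working weekends would shrink that window.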

You've possibly also got the bonus that the people who use the app are different from the people who pay for it, so you've got some leeway: your system can blip for a minute and have requests fail (as long as there's no data loss), and that won't get reported up the customer's management chain, because hitting F5 30 seconds later springs it back into life, and they carry on with their day without bothering to fire off an email or walk over to their boss's desk to complain that the website was broken for a second.

At a previous company we deployed each customer on their own VM in either AWS or Azure, with both the app and the database deployed on it. It was pretty rare for a VM to fail, and when it did the cloud provider automatically reprovisioned it on new hardware, so as long as your startup scripts are configured correctly and run quickly, you might only be down for a few minutes. It was incredibly rare for an engineer to have to intervene manually, but because our setup was very simple we could nuke a VM, spin up another one, deploy the software back onto it, and be up and running again in under 30 minutes, which to us was worth the reduced costs.

> No one is talking about redundancy though. I love setups like this but prod environments need robust forms of redundancy

Not really, there are many kinds of apps that don't need such redundancy.

> Also, none of the solutions discussed here gracefully handle new connections on the new service while waiting for all connections to terminate before shutting down the old service. Maybe some of the more esoteric Ansible setups do, idk.

I have dealt with this in code, with shutdown hooks on the server that wait for existing requests to finish processing and reject new ones. Clients just end up retrying. Not all apps can accept this, but many can.
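For anyone curious what that looks like, here is a minimal sketch of the idea in Python. It is not the code from my server: `DrainingGate` and `handle` are hypothetical names, and the 503 status is just one reasonable way to tell clients to retry.

```python
import threading


class DrainingGate:
    """Tracks in-flight requests and refuses new ones once shutdown begins."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._draining = False

    def try_enter(self):
        """Return True if the request may proceed, False if it must be rejected."""
        with self._cond:
            if self._draining:
                return False
            self._in_flight += 1
            return True

    def leave(self):
        """Mark one request as finished and wake a waiting shutdown if idle."""
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def shutdown(self, timeout=None):
        """Stop admitting requests, then wait for in-flight ones to finish.

        Returns True if everything drained, False if the timeout expired.
        """
        with self._cond:
            self._draining = True
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)


def handle(gate, work):
    """Run `work` behind the gate; return an HTTP-style status code."""
    if not gate.try_enter():
        return 503  # rejected during shutdown; the client is expected to retry
    try:
        work()
        return 200
    finally:
        gate.leave()
```

In a real service, `shutdown()` would typically be called from whatever shutdown hook the platform offers (for example a SIGTERM handler), with a timeout so that a stuck request cannot block the deploy forever.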