Not parent, but how about a load balancer and VMs? =) Or, if you have only one machine, I believe there are ways on Linux / BSD to start a new binary without killing ongoing connections to the old process.
Edit: Oh, or you know, just don't? Maybe no one will notice 5 seconds of downtime :P It depends.
You can use SO_REUSEPORT (usually set alongside SO_REUSEADDR) to bind multiple instances to the same port. This means you will be using the Linux network stack as the load balancer, though.
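A minimal sketch of what that looks like, assuming Linux (or another platform where Python exposes `socket.SO_REUSEPORT`). The helper name `bind_reuseport` is made up for illustration:

```python
import socket


def bind_reuseport(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Return a listening TCP socket that can share `port` with other sockets.

    Every socket that wants to share the port must set SO_REUSEPORT
    before calling bind().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


if __name__ == "__main__":
    old = bind_reuseport(0)            # port 0: let the kernel pick a free port
    port = old.getsockname()[1]
    new = bind_reuseport(port)         # second listener on the same port
    print("both instances listening on port", port)
    old.close()
    new.close()
```

In a real deployment the old and new binaries would each do this in their own process. As far as I know, connections still sitting in the old listener's accept queue can be dropped when it closes, so the old process should stop gracefully rather than just being killed.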
Even if you have only one machine, you could deploy a reverse proxy in front of it (think nginx, haproxy and co). Then you can start your application on two different endpoints and instruct the proxy to switch from one to the other. nginx supports live reloading of its configuration, so I think that should work.
For us, the approach is to use multiple instances of the monolith deployed simultaneously (listening on different ports) and to move traffic using a purpose-built software load balancer that is part of the same solution stack.
The load balancer software is an extremely simple concoction based upon the exact same primitives as our main application (AspNetCore). The only intelligence is in looking at request trace ids to determine old vs new routing.
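I won't go into exactly how the trace ids are used, but one possible reading can be sketched like this: requests whose trace id was first seen before the cutover stay on the old instance, and everything else goes to the new one. The class `TraceRouter` and its method names are hypothetical, in Python rather than C# purely for brevity, and not our actual code:

```python
class TraceRouter:
    """Route requests to the old or new instance based on trace id.

    Assumption for this sketch: a trace id first seen before the cutover
    is pinned to the old instance until that trace is finished.
    """

    def __init__(self, old_backend: str, new_backend: str) -> None:
        self.old = old_backend
        self.new = new_backend
        self.cut_over = False
        self.pinned: set[str] = set()  # trace ids that must stay on the old instance

    def start_cutover(self) -> None:
        """From now on, unseen trace ids are sent to the new instance."""
        self.cut_over = True

    def route(self, trace_id: str) -> str:
        if trace_id in self.pinned:
            return self.old            # trace started on old, keep it there
        if not self.cut_over:
            self.pinned.add(trace_id)  # remember traces that began before cutover
            return self.old
        return self.new

    def finish(self, trace_id: str) -> None:
        """Forget a completed trace so it no longer pins traffic to old."""
        self.pinned.discard(trace_id)
```

Once `pinned` is empty after the cutover, nothing is routed to the old instance any more and it can be shut down.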
A deployment can be resolved within a few seconds using this approach, even under heavy load.
We use a single VM per customer environment and are able to effectively realize zero downtime during business hours. We are granted maintenance windows after every business day and over weekends, so it is a little bit easier than if we were managing a credit card transaction processing system or similar.
To be fair, our approach only works because of how tightly integrated our entire vertical is. We went all-in on writing our own way to build, deploy and manage our own software. The persistence mechanism was developed with all of this in mind.
We are moving beyond all of this deployment magic though. There is a bold new configuration-driven realm that we are entering into which quickly begins to obviate the need for frequent & disruptive software deployments in the first place.
It might not necessarily, and I was genuinely wondering what patterns people use, but there are a few things about our monolith in particular that have seemed to make it harder for us:
- The monolith has to do all the work of the whole system when it starts, so start-up time is a lot longer, which makes rolling deployments much slower
- The monolith is much heavier in resource requirements, so it's much more expensive to spin up multiple of them
- The monolith has a much larger surface area, so post-deployment but pre-release verification is much more complex
- The migration path for a smaller service to support zero-downtime deployment is simpler than for a larger service, so on average it's probably easier to get it working for an existing 'microservice' (although it may be easier for the monolith than for an equivalent full set of microservices)
Not a real issue at all in my experience. Businesses all over have done this for decades.
If you have many, many microservices, what is the combined resource usage vs. the monolith? Probably about the same, and microservices may actually use more resources if we are talking about anything with a runtime like Java.
Migrations in my experience mean migration of data, so it's about the volume of data you need to change and not about the code operating against that data. Whether you have your monolith or a bunch of microservices waiting for the conversion job does not matter. It also does not matter whether it is your monolith or the microservices that have to be able to read the data at any given time. With a quick-starting microservice you may be able to say "eff it, a transaction might fail, but the new service will be up quick and the user can redo it", but that's not what you really want in big mission-critical workloads.
You really want to wait until all transactions on the old nodes finish before shutting them down. So your load balancer will route all traffic to the old nodes until the new nodes are up, then shift traffic over, and you kill your old nodes when all their transactions have finished. In old real-iron businesses you then deploy to those nodes and bring them back in, which means reduced redundancy and less capacity to handle load in the meantime. In current cloud environments you just start up new instances and literally just kill the old nodes after you've let the transactions finish. So you could even deploy during high-load times.
I'm not talking hours of deployment here. Monolith doesn't mean you have to have all the cruft and startup times of, say, a JBoss server with EJBs. You can have a monolith that starts up in a reasonable amount of time if you make the right choices in how you build it.
Why would post-deployment verification be slower? If you do this manually, why does it matter whether your changed customer flows are spread across 6 microservices that you just deployed new versions of, or live in your monolith?
Also, monolith doesn't necessarily mean you have exactly one service. At my last place we had multiple services for various parts of the overall system, but they were all monoliths themselves and shared some code through a library as well. The services definitely weren't micro and did lots of things that weren't that related in the end. We could've split them up into many microservices.
The same ones you would use for a microservices setup, except there is just one of them? E.g., AWS ECS makes it very straightforward to get this working for a monolithic setup. In the end you need the same patterns (health checks, load balancer, draining, etc.) regardless of monolith vs. individual microservice.
The way Heroku approaches this is by deploying the new version while the old version is still running, then within a few minutes switching routing from the old version to the new version, and then shutting down the old version.
You deploy to the other one, do a health check, warm it up and start routing to it.
No downtime.
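Roughly, as a sketch: the three callables are placeholders for whatever your setup uses (an HTTP health endpoint, a few warm-up requests, a proxy reload), and `cut_over` is a made-up name.

```python
import time
from typing import Callable


def cut_over(
    is_healthy: Callable[[], bool],
    warm_up: Callable[[], None],
    activate: Callable[[], None],
    attempts: int = 10,
    delay: float = 0.5,
) -> bool:
    """Poll the new instance; once healthy, warm it up and route traffic to it.

    Returns True if traffic was switched, False if the new instance never
    became healthy. In that case nothing is changed and the old instance
    keeps serving.
    """
    for _ in range(attempts):
        if is_healthy():
            warm_up()    # e.g. prime caches and connection pools
            activate()   # e.g. repoint the proxy at the new instance
            return True
        time.sleep(delay)
    return False
```

If it returns False, the old version is still taking all the traffic, so a failed deploy costs you nothing but time.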