I don’t know how anyone can afford these migrations, especially for production on-prem workloads, without building literally duplicate sets of hardware clusters and then manually migrating workloads.
We usually reuse the VMware hardware and (most importantly) file storage. Some additional hardware is required temporarily so you can build out the initial OpenShift nodes. The VMware nodes are decommissioned and converted to OSV nodes as the conversion goes along. With some kinds of file storage (cough NetApp) the conversion is zero copy: the VM literally stays where it is. With others we copy to new NFS storage areas, which are provisioned on the same physical hardware.
It's a very scalable and almost fun task once you get into it.
Alternatives to VMware can run VMware VMs almost immediately, by translating the configuration and with only a few (or sometimes no) changes to the guest. Usually those changes are scriptable. I've done it a few times: moving Windows guests between VMware and KVM pretty much just worked; the rest was optimisation, i.e. guest driver changes, etc.
Live migration is not realistic between different hypervisors, but a very short downtime per VM is realistic if the new hypervisor can adopt the old disk images directly, which some can. If you want, you can convert formats in the background while the VM is running on the new hypervisor. E.g. KVM and things built on KVM can do all these things.
So to each guest, it looks like a quick reboot with a quick hardware upgrade.
If that's coordinated properly, with a generic HA or Kubernetes setup, there's absolutely zero service downtime (if there are no serious mistakes), as it's just nodes within a cluster taken down one at a time while the others keep the services running, and state migrates among the nodes which are live.
Most of the things you'll change when migrating are the same across large numbers of VMs that are configured the same way except for their disk images and minor things like MAC/IP. So after you've verified a small number, you can go right ahead and script the migrations for another thousand VMs, even doing them in parallel.
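As a rough sketch of that "verify a few, then script the rest" pattern: the `migrate_vm` function below is a hypothetical placeholder (it only echoes the per-VM fields), and the canary count, worker count, and VM names are made-up values for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def migrate_vm(vm):
    """Hypothetical placeholder for the real per-VM migration recipe."""
    # A real version would translate the config, attach the disks and boot
    # the guest on the new hypervisor. This stub only echoes the per-VM fields.
    return {"name": vm["name"], "mac": vm["mac"], "ip": vm["ip"], "ok": True}

def migrate_fleet(vms, canary_count=2, workers=8):
    # Migrate a few VMs one at a time first and stop if any of them fails.
    canaries, rest = vms[:canary_count], vms[canary_count:]
    results = [migrate_vm(vm) for vm in canaries]
    if not all(r["ok"] for r in results):
        raise RuntimeError("canary migration failed; fix the recipe before scaling out")
    # The remaining VMs share the same recipe, so run them in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results.extend(pool.map(migrate_vm, rest))
    return results

# Twenty identically configured VMs that differ only in name, MAC and IP.
fleet = [
    {"name": f"web-{i:03d}", "mac": f"52:54:00:00:00:{i:02x}", "ip": f"10.0.0.{i}"}
    for i in range(1, 21)
]
done = migrate_fleet(fleet)
print(sum(r["ok"] for r in done), "of", len(fleet), "migrated")
```

The canaries run serially on purpose, so a bad recipe fails on two VMs instead of a thousand.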
You don't need to migrate all VMs at the same time, and you shouldn't do that anyway. So the temporary hardware / cloud cost can be in the low single-digit percentage (for a few weeks to months at 40k VM scale, a few hours to days at 10 VM scale). You probably have some slack in there already, though, so you might not need any additional hardware.
The 40k servers are probably made up of multiple redundant vSphere clusters with failover. You simply take one of those redundant clusters and migrate one half of it over. Then the other half. Then duplicate that process. As you build more compute in the new stack, you can decommission more and more of the old stack and convert it. The transition would progress like a cascade, with larger and larger groups of clusters being migrated at once until you're left with the one-off, ad-hoc, weirdo clusters at the end that need to be manually migrated (usually with great effort).
The actual hardware servers are clustered together into pools of resources. The pools are where the VMs live. The bigger the new pool becomes, the faster you can empty the old one. So the migration starts very slowly, ramps up quickly, and then tapers off.
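One way to picture the slow start, ramp, and taper is a toy wave plan. This is only a sketch under loud assumptions: equal-sized hosts, every emptied old host is converted immediately, and the planned wave size doubles each round as confidence grows, capped by the free capacity in the new pool. The function name and all the numbers are made up for illustration.

```python
import math

def plan_waves(total_vms, total_hosts, vms_per_host, first_wave):
    """Return the number of VMs moved in each wave (toy model, see assumptions above)."""
    remaining, migrated, planned, waves = total_vms, 0, first_wave, []
    while remaining > 0:
        # Old pool keeps only the hosts it still needs; the rest are converted.
        old_hosts = math.ceil(remaining / vms_per_host)
        new_hosts = total_hosts - old_hosts
        free = new_hosts * vms_per_host - migrated
        # A wave is limited by the plan, by free room in the new pool,
        # and by how many VMs are left to move.
        wave = min(planned, free, remaining)
        if wave <= 0:
            raise ValueError("no free capacity in the new pool; add temporary hardware")
        waves.append(wave)
        remaining -= wave
        migrated += wave
        planned *= 2  # assumption: batch size doubles as the process is proven
    return waves

# 700 VMs on 100 hosts that could hold 1000, starting with a 10-VM wave.
waves = plan_waves(total_vms=700, total_hosts=100, vms_per_host=10, first_wave=10)
print(waves)  # [10, 20, 40, 80, 160, 300, 90]
```

In this model the early waves are limited by the plan, the big middle wave by the spare capacity, and the last wave by whatever is left over.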
> You simply take one of those redundant clusters and migrate one half of it over.
For that half you are migrating, you are essentially operating without redundancy. If these are serious production workloads, the tradeoff is not as simple as you make it seem.
The way a cluster works is you have a giant pool of resources, say 33 - 50% larger than the workload. The workload is a dozen VMs. The cluster is 8 giant compute servers and two giant storage servers acting as one giant compute and storage unit. For redundancy you have extra clusters lying around with no workload, added as failovers.
Normally, if one server on a production cluster goes down, the other members of that cluster will seamlessly take over. This is where the extra capacity comes in. You don't migrate the workload to another cluster. You just lose overhead capacity. If you lose too much, then you start migrating parts of the workload to the failover. Not the entire thing.
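To put rough numbers on "you just lose overhead capacity", here is a back-of-the-envelope sketch. It assumes equal-sized compute hosts and a workload that rebalances evenly, and it ignores the storage servers; `tolerated_failures` is a made-up helper name.

```python
import math

def tolerated_failures(hosts, headroom):
    """Compute hosts that can fail before the workload no longer fits.

    headroom is capacity / workload - 1, so 0.5 means the cluster is
    50% larger than the workload it carries.
    """
    workload_in_hosts = hosts / (1 + headroom)  # workload measured in whole hosts
    return math.floor(hosts - workload_in_hosts)

for headroom in (0.33, 0.5):
    print(f"8 hosts, {headroom:.0%} headroom:", tolerated_failures(8, headroom), "failure(s)")
```

With 8 compute hosts this gives one tolerated failure at 33% headroom and two at 50%; beyond that, parts of the workload would have to spill over to the failover cluster.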
You usually don't have to use your redundant cluster at all until it's time to rebuild the failed cluster. You might pick one of these spare clusters you keep around for redundancy to migrate all or part of the production workload to while you fix the production cluster.
When doing a big migration you take a percentage of your redundancy and convert it to the new environment. This is your staging environment. Once it is capable of doing work, you slowly grow it out and shrink the old environment at the same time.
This is basically how HA works with VxRail. I buy more VxRail than I will actually host because if a node fails then the VMs can be moved, not always without downtime but without loss. If I run out of HA nodes or start running low on capacity, then Aria will start sending alerts.
Ha, I have done migrations recently from vSphere to vSphere using vMotion and it was easy.
But it still took a duplicate set of HW, and I couldn’t imagine doing it without a lot of IaC and automation in place (plus physical space, power and cooling).
a) you migrate in increments, so even if your migration needs to run old and new to compare, you don't need to do it for everything at once.
b) you probably have some slack, and you can make slack by packing tighter during migration.
c) you probably have some amount of regular hardware refresh. Retaining the old hardware a bit longer can get you more headroom for migration.
d) some servers can probably take an extended maintenance outage during conversion.
e) depending on everything, you might be able to get short term capacity from cloud or short term leases.
There's almost certainly some automation around migration. Some of it might even work.
Have a plan, make progress... even if you don't migrate everything by the date, you'll have done a lot and reduced the Broadcom bill.