| > they are going to need an infrastructure person No. I run multi-site Ceph+Nomad clusters with NixOS on Hetzner for our startup and maintaining those takes less than 5% of my time. By using great tools and understanding them well you can do it with little manpower. I learned all those tools in around 3 months total -- so around as much as getting a basic understanding of AWS IAM ;-) The only thing you don't get with that from your list is auto-scaling. But the with Hetzner the price difference vs AWS is 10x for storage, 20x for compute, and 10000x for traffic, so we just over-provision a little. And my 5% time /includes/ manual upscaling. Yes, I am oncall 24/7 to manage that infra, but I'd be as well when using hosted cloud services. Yes, fixing a Ceph issue, or handling Hashicorp Consul not handling an out-of-disk situation correctly is more complicated than waiting for S3 go come back from its outage, but the savings are massive. Testing whether your backup restore works is something you need to do equally with hosted services. So it is definitely possible to self-manage everything, for 5% of one engineer. |
“and understanding them well” is doing a lot of legwork there. From a standing start how does a startup that has the skills & experience to make the product but not necessarily manage the infrastructure get to the point of understanding the tools well, or even knowing which tools are best to learn to the point of understanding well?
> So it is definitely possible to self-manage everything, for 5% of one engineer.
I can accept that as true, if you have the right person/people, and they are willing (particularly the on-call part).