
You really don't need all 10 people on-call to know k8s to that level. They just need to know enough to tell when to wake someone else up.

Everywhere I have worked where we have run clusters in the 100s to 1000s of nodes, we have rarely had a team larger than 4-5 true k8s folks, and even then it's been split between folks who are very hardware provisioning/network/etc. focused and higher-level k8s folks who also take on a large portion of the CI/CD work.

At smaller scale (in the $1M/yr ballpark) I have done all the k8s bare metal ops myself, along with all the CI/CD, and been responsible for a ton of the backend programming too. This is feasible because with distros like Talos etc. it doesn't take a lot of manpower once it's set up, and upgrades aren't too painful at small scale if you aren't running stateful services.

So tbh no, you ideally just need 2 competent folks at ~200k/yr each who have done it before. The rest of the on-call rotation is just the rest of your engineers (and if you are at $1M/yr cloud spend you have more than 10 of those).