| We've [1] been using Hetzner's dedicated servers to provide Kubernetes clusters to our clients for a few years now. The performance is certainly excellent, we typically see request times half. And because the hardware is cheaper we can provide dedicated DevOps engineering time to each client. There are some caveats though: 1) A staging cluster for testing updates is really a must. YOLO-ing prod updates on a Sunday is no one's idea of fun. 2) Application level replication is king, followed by block-level replication (we use OpenEBS/Mayastor). After going through all the Postgres operators we found StackGres to (currently) be the best. 3) The Ansible playbooks are your assets. Once you have them down and well-commented for a given service then re-deploying that service in other cases (or again in the future) becomes straightforward. 4) If you can I'd recommend a dedicated 10G network to connect your servers. 1G just isn't quite enough when it comes to the combined load of prod traffic, plus image pulls, plus inter-service traffic. This also gives a 10x latency improvement over AWS intra-az. 5) If you want network redundancy you can create a 1G vSwitch (VLAN) on the 1G ports for internal use. Give each server a loopback IP, then use BGP to distribute routes (bird). 6) MinIO clusters (via the operator) are not that tricky to operate as long as you follow the well trodden path. This provides you with local high-bandwidth, low-latency object storage. 7) The initial investment to do this does take time. I'd put it at 2-4 months of undistracted skilled engineering time. 8) You can still push ancillary/annoying tasks off onto cloud providers (personally I'm a fan of CloudFlare for HTTP load balancing). [1]: https://lithus.eu |
Do you have to ask Hetzner nicely for this? They have a publicly documented 10G uplink option, but that is for external networking and IMHO heavily limited (20TB limit). For internal cluster IO 20TB could easily become a problem