by ClifReeder
1657 days ago
Strong agree. We were using Fargate nodes in our us-east-1 EKS cluster and not all of our nodes dropped, but every coredns pod did. When they came back up their age was hours older than expected, so maybe a problem between Fargate and the scheduler rendered them “up” but unable to be reached? Either way, it was surprising to us that already-provisioned compute was impacted.
Couldn't just delete the Fargate profile without a working EKS control plane. I lucked out in that the label selector the kube-dns Service used was disjoint from the one I'd set in the Fargate profile, so I just made a new "coredns-emergency" deployment and cluster networking came back. (cluster-autoscaler was moot since we couldn't launch instances anyway.)
I was hoping to see something about that in this announcement, since the loss of live pods is nasty. Not inclined to rely on Fargate going forward. It is curious that you saw those pod ages; maybe Fargate kubelets communicate with EKS over the AWS internal network?
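For anyone wondering why the "coredns-emergency" trick worked, here is a rough sketch of the selector logic as I understand it, not anything lifted from EKS itself. It assumes plain equality-based label matching and ignores the namespace part of a Fargate profile selector. The `k8s-app: kube-dns` Service selector is what I'd expect on a stock cluster; the `compute: fargate` profile label and the `app: coredns-emergency` label are hypothetical stand-ins.

```python
from typing import Mapping


def selector_matches(selector: Mapping[str, str], labels: Mapping[str, str]) -> bool:
    """Equality-based match: every key/value in the selector must appear in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Assumed selector on the kube-dns Service.
SERVICE_SELECTOR = {"k8s-app": "kube-dns"}
# Hypothetical label used in the Fargate profile selector (namespace match omitted).
FARGATE_PROFILE_LABELS = {"compute": "fargate"}


def placement(pod_labels: Mapping[str, str]) -> dict:
    """Report whether a pod would sit behind the Service and whether Fargate would claim it."""
    return {
        "behind_service": selector_matches(SERVICE_SELECTOR, pod_labels),
        "on_fargate": selector_matches(FARGATE_PROFILE_LABELS, pod_labels),
    }


# Original coredns pods: carry both labels, so they land on Fargate.
original_pod = {"k8s-app": "kube-dns", "compute": "fargate"}
# Emergency deployment: keeps the Service label, drops the Fargate one.
emergency_pod = {"k8s-app": "kube-dns", "app": "coredns-emergency"}

print(placement(original_pod))
print(placement(emergency_pod))
```

Because the two selectors share no keys, a pod can satisfy the Service selector without satisfying the profile selector, so it gets scheduled on regular nodes while still receiving kube-dns traffic.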