
by purrcat259 1275 days ago

This actually happened to us last week.

A stretch with no deployments revealed that a legacy background processor eventually starts losing its connection to the message queue and gets stuck in a state where it never reconnects.

Until then, deployments had always cycled the pods before the issue could manifest.

1 comment

jldugger 1275 days ago

This is something a (now former) colleague of mine pointed out: the Kubernetes descheduler can enforce a maximum pod lifetime[0], which effectively forces continual restarts. So if your system cannot tolerate running for a long time continuously, this is one way to gracefully restart long-running pods.

[0]: https://github.com/kubernetes-sigs/descheduler#podlifetime
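
To make the idea concrete, here is a minimal Python sketch of the selection rule behind a maximum-lifetime policy: anything that has been running longer than the limit becomes a candidate for eviction. This is not the descheduler's actual code; the `Pod` class, the `pods_past_lifetime` function, and the pod names are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Pod:
    """Hypothetical stand-in for a pod record; not a Kubernetes API type."""
    name: str
    started_at: datetime


def pods_past_lifetime(pods, max_lifetime, now=None):
    """Return the pods that have been running longer than max_lifetime."""
    now = now or datetime.now(timezone.utc)
    return [p for p in pods if now - p.started_at > max_lifetime]


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 8, tzinfo=timezone.utc)
pods = [
    Pod("worker-old", now - timedelta(days=7)),
    Pod("worker-new", now - timedelta(hours=2)),
]

# With a one-day limit, only the week-old pod is selected.
expired = pods_past_lifetime(pods, timedelta(days=1), now=now)
print([p.name for p in expired])
```

In a real cluster the eviction itself is what triggers the restart: the workload's controller notices the missing pod and schedules a fresh one, which is why this behaves like a deployment-free pod cycle.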
