| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by t0mek 1440 days ago

This approach is very similar to the way how Kubernetes custom resource reconciliation works (and Kubernetes in general, but the custom resources is the way how you can bring your own logic there).

In Kubernetes you can define your own types, Custom Resources (basically JSONs with schema) and deploy "operators" - services that should handle these new types. Every time you create or modify your custom resource, the operator is triggered and it should "reconcile" your resource.

Now this reconciliation process is stateless. It doesn't know what exactly changed in your resource, so it should just go through the list of all the things that it needs to do (create or remove pods, services, configmaps, etc.) and if something is not right (e.g. a missing service), try to make it right or fail. In any case, the output should be written in the custom resource's .status section.

There's no active waiting - if the operator sees that some other resource is not ready yet (a required pod is still starting), it should just mark your resource as not ready and finish. If the pod state changes, the next reconciliation will notice it. It should do as much as it can to bring reality to the expectation, but not more.

If implemented correctly, this is surprisingly resilient. The idempotent nature of the reconciliation loop makes it perfect for errors handling. For instance, your reconciliation may fail because some pod is not running correctly. It's nothing that your operator can fix. But if the pod auto-heals (maybe the network connectivity was restored or an external service is available again), the operator will auto-heal as well, without a manual intervention. The next reconciliation loop will just see the pod is available again and carry on.