| I like the approach of the program managing its own scheduling much better than cron. You have an endpoint that is constantly available to scrape metrics from. You can send it an RPC to say "hey, do the sync now". The behavior when updating the code is well-defined. This particular daemon doesn't really do any of these things, but when they're needed they're trivial to add. I also don't think it's a bad idea to do the update at the desired interval. Someone could go into Cloudflare's web interface and mess up the configuration; with this design, the mistake is fixed within 30 seconds. With a design where an update request is only issued if an IP change is detected, that mistake will probably never be fixed and inconsistent behavior will result. The underlying problem is assuming that your sync program knows the entire state of the Universe and can properly detect a change. It can't, and you shouldn't design something that assumes it can. All you want to do is push your known-good state to a configuration repository. So that's what the code does. Strange behavior lurks when a sync process does different things every iteration. If Cloudflare's API changes for some reason and your program only applies an update when there's a change, you only detect the failure when an IP address change is needed. Now your IP address is wrong in Cloudflare while you rewrite your program. If you just apply the change periodically even though there is no diff, you get a nice alert that the API is broken before you need to change. That increases the probability that you'll have a working program at the time when you actually need to change the IP. Instead of stressing out about the problem, you can take time to fix it properly, and no outage results. Furthermore, this program properly maintains a global rate limit of API calls across all configurations.
That is way easier to do in a daemon (limiter.Take() whenever you feel like making an update) than in a cron job, which is going to have to make a different number of API calls each time it runs, and you don't control when the job runs. The rate limit is likely high enough that you will never notice, but technically this approach is more correct than the "yolo do whatever" cron job. Programs that are in a sleep(30 seconds) call do not use much memory or CPU. They can be trivially swapped out if the system is under memory pressure. A go application like this is going to use less than 10M of RAM all in. In 1962 that's a lot. Today that's nothing. Given the existence of a kubernetes configuration in the repository, I think they made the right decision with the design here. Kubernetes does have cron, but cron is a high-overhead thing in Kubernetes. It creates a new job and schedules the associated pod every time the scheduling interval is reached. That's one of the most expensive things you can do, and while no harm will be caused running this every 30 seconds, the maintainability is lower and the probability of something unusual happening is higher. Speaking from experience, "kubectl get events" also becomes completely unusable when you use cron to run frequent jobs. To be fair, I have no idea what sort of environment involves production traffic in a kubernetes cluster going to a dynamic IP address, and that sounds like something I'd look into before writing a sync script like this. But overall, I think the design is solid. It has the potential to be properly observable. It has consistent behavior. You can detect problems before they become an outage. It properly rate-limits calls to Cloudflare no matter how many records you have configured it to sync. That, to me, sounds like a good design. I would personally have used a statefulset instead of a deployment. A deployment is designed to be 100% available.
A statefulset is designed to have the exact number of replicas you specify running at once. For a sync job like this, you probably only ever want 1 running. When you upgrade the software, you don't want the old version to be syncing stuff while the new version is waiting to pass its first health check. But I doubt anything bad will happen in this case, so it doesn't really matter. |
If you write your own scheduler, you have reinvented the wheel. Running a cron job is way simpler, and everybody is familiar with it. Nobody is familiar with your own custom scheduler.
> You have an endpoint that is constantly available to scrape metrics from.
I don't think you need metrics for this service at all. It has to "just work".
> You can send it an RPC to say "hey, do the sync now".
You don't need any of that. Keeping a service alive is unnecessary work, and an RPC interface is overengineering. This is a very simple problem; don't overcomplicate it.
> Someone could go into CloudFlare's web interface and mess up the configuration; with this design, the mistake is fixed within 30 seconds.
My script has a --force switch, so you can do the same with it. Still, you shouldn't manage an IP manually when using a DDNS script. The cache is there to "be nice" to CloudFlare (don't hit them unnecessarily) and to your bandwidth. ddclient also had this feature.
> If you just apply the change periodically even though there is no diff, you get a nice alert that the API is broken before you need to change.
That's a valid point, I agree, but I don't think it's a big problem. REST API endpoints should never change, and you should have years before CloudFlare deprecates an endpoint, so I think breaking API changes are a non-issue.
> Furthermore, this program properly maintains a global rate limit of API calls across all configurations.
I did not even think about rate limits, because my script hits the CloudFlare API so rarely that it should never reach them. If it were needed, rate limiting could be implemented with the cache, which I have anyway. You can also sidestep the issue by lengthening the time between updates to minutes. Availability should be a non-issue, because if your service needs high availability, you should get a fixed IP anyway.
> Programs that are in a sleep(30 seconds) call do not use much memory or CPU. They can be trivially swapped out if the system is under memory pressure. A go application like this is going to use less than 10M of RAM all in. In 1962 that's a lot. Today that's nothing.
Agreed, but 10 MB is still infinity times more than zero. :)
> Given the existence of a kubernetes configuration in the repository
A dynamic DNS script like this should be nowhere near a Kubernetes cluster. If you think there is a use case for Kubernetes with this script, you have huge problems anyway (e.g. not understanding what Kubernetes is and how you should operate it properly).
> To be fair, I have no idea what sort of environment involves production traffic in a kubernetes cluster going to a dynamic IP address, and that sounds like something I'd look into before writing a sync script like this.
So we are on the same page about Kubernetes. :)