| Look, I obviously don't know the specifics but I think you swung too far on the other side and also just stitching together existing components. A single box running cron is quick and easy but I would be wary of hinging anything non-dev facing on that. * You can't fearlessly patch the cron box without knowing when the jobs run, I don't want any special cases. Also how would I even know when you have jobs scheduled looking at a fleet -- read your out of date docs? Ew no. So a messaging system, guess the devs were already familiar with Kafka, is necessary to process the jobs across multiple nodes. * Individual nodes are unreliable and you don't have any durable persistent storage. Most people don't like storing data in Kafka even though it's possible so they went with a database. * Cron doesn't have any mechanism to give you a history of jobs that isn't built into your script or parsing logs. Ditto with failure notifications. You also can't reprocess failed jobs except manually. Guess you just wait another 4 hours? * You now also can't duplicate the server because they're both going to try and do the export every 4 hours and step on one another. Woop, you made a system where an assumed safe operation "adding something" breaks stuff. This kind of thing is a nightmare if you already have queues and a database because why would you stand up another thing but if you had none of that to begin with then yeah... makes total sense. Like this is the reality of ops and running something "production ready" it's a lot of big ole complex HA platform so that you can run your 5 lines of code and not have to worry about any of the hard problems like availability, retrying, resource contention, timing, data loss, locking. |
My solution integrated with current tech, didn't have any of the problems outlined in those bullet points, and has required 0 maintenance and or updates for over 4 years. I don't want to even think about how many Kafka and couch releases there have been since then.