by jmillikin 3097 days ago
Hello! I work at Stripe and helped with some aspects of the Kubernetes cron stuff -- maybe these answers can be helpful.

  > Why wasn't the final sentence "and to re-evaluate if
  > moving forward was even a good idea?"
I think that's sort of implied -- complex technical projects have a risk of unexpected roadblocks, and it's important that "stop and roll back" always be on the list of options. Never burn your ships.

We invested a (proportionally) large amount of engineering effort to ensure we had the ability to move the whole shebang back to Chronos ~immediately. As noted in the article, we exercised this rollback feature several times when particular cronjobs deviated from expected behavior under Kubernetes.

  > Because I get nervous every time someone is relying on
  > their patches to be included upstream. Or they need to
  > dive in to the internals of something repeatedly. That
  > screams "not production ready" to me.
This is the same basic model as distro-specific patches to the Linux kernel.

Every engineering organization reaches the point where they want more features than are available in an existing platform. The most practical solutions for this are to launch a new platform ("Not Invented Here") or to contribute code upstream. The first option can provide better short-term outcomes, but is usually inferior on multi-year timescales.

Consider that with a mature build infrastructure, internal builds are actually the latest stable release plus cherry-picked patches. This provides the best of all worlds -- an upstream foundation, with bug fixes on our schedule, and an eventually-consistent contribution to the community.