by tptacek
910 days ago
Interesting! We're mostly not kidding about that.

We launched in 2020 with a scheduler that looks a lot like how K8s works†. We ran into scaling issues. Instead of scaling a globally coordinated "eye in the sky" scheduler, like the ones Nomad and K8s offer, we relaxed a constraint ("when you ask to run a job, we'll move heaven and earth to put it somewhere") and wound up with a totally different scheduling model (a market-based system that bids on resources, where requests to place jobs are all effectively fill-or-kill limit orders).

This was a bet. We're bullish about this bet! Even without K8s, having core scheduling be "less reliable" but with a simpler, more responsive interface puts us in a position to do some of the "move heaven and earth" work that K8s and Nomad do in simpler components (like: we can write Elixir code to drive the scheduler). But it might not pay off! That's what makes it a bet.

† (see: comments on this thread asking why we overengineered and wrote our own version of stuff; the expectation that you'd run a platform like Fly.io on standard K8s or Nomad is pretty strong!).
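To make the "fill-or-kill" idea concrete, here is a rough sketch of what that kind of placement could look like. This is not Fly.io's actual scheduler; `Worker`, `place_fill_or_kill`, `place_anywhere`, and the "tightest fit wins" bidding rule are all hypothetical names and choices made up for illustration. The point is only that the core call either places the job right now or fails right away, and any "try harder" logic lives in a separate, simpler layer on top.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Worker:
    """Hypothetical host that bids on placement requests."""
    name: str
    free_cpus: int
    free_memory_mb: int

    def bid(self, cpus: int, memory_mb: int) -> Optional[int]:
        # Decline (None) if the whole job does not fit right now.
        if cpus > self.free_cpus or memory_mb > self.free_memory_mb:
            return None
        # Assumed bidding rule: fewer leftover CPUs means a tighter fit.
        return self.free_cpus - cpus


def place_fill_or_kill(workers: List[Worker], cpus: int, memory_mb: int) -> Optional[str]:
    """Place the job immediately on the best bidder, or fail immediately."""
    offers = [(offer, w) for w in workers
              if (offer := w.bid(cpus, memory_mb)) is not None]
    if not offers:
        return None  # "kill": no queueing and no retries at this layer
    _, winner = min(offers, key=lambda pair: pair[0])
    winner.free_cpus -= cpus
    winner.free_memory_mb -= memory_mb
    return winner.name


def place_anywhere(pools: Dict[str, List[Worker]], cpus: int,
                   memory_mb: int) -> Optional[Tuple[str, str]]:
    """Higher-level driver: try each pool in order until one fills the request."""
    for pool_name, workers in pools.items():
        placed = place_fill_or_kill(workers, cpus, memory_mb)
        if placed is not None:
            return pool_name, placed
    return None
```

In this sketch the persistence ("move heaven and earth") is a plain loop in `place_anywhere`, which is the kind of logic that could just as easily live in a separate service driving the scheduler.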