

by achille-roussel 817 days ago

Distributed coroutines are a primitive for expressing transactional workflows that may outlive the initial request/response that triggered them (think any form of async operation). Distribution allows effective use of compute resources, while capturing the state of coroutines and their progress is the key addition that makes it possible to execute workflows and guarantee their completion.
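To make "capturing the state of coroutines and their progress" concrete, here is a minimal Python sketch. It assumes the workflow is modelled by hand as a JSON-serializable dict with an explicit step index; the step names (`reserve`, `charge`, `ship`) and the `pc` field are hypothetical, and this is not a description of how Dispatch implements it.

```python
import json

# Hypothetical three-step workflow; each step mutates a plain dict.
def reserve(state):
    state["reserved"] = True

def charge(state):
    state["charged"] = state["amount"]

def ship(state):
    state["shipped"] = True

STEPS = [reserve, charge, ship]

def new_workflow(amount):
    # "pc" acts as a program counter: the index of the next step to run.
    return {"pc": 0, "amount": amount}

def step(state):
    """Run one step and advance; return True while work remains."""
    STEPS[state["pc"]](state)
    state["pc"] += 1
    return state["pc"] < len(STEPS)

# Node A runs the first step, then snapshots the workflow.
wf = new_workflow(42)
step(wf)
snapshot = json.dumps(wf)

# Node B (or node A after a crash) restores the snapshot and finishes.
resumed = json.loads(snapshot)
while step(resumed):
    pass
```

Because the snapshot holds both the local data and the position in the workflow, any node that can read it can carry the job to completion.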

A load balancer can help distribute new jobs across a fleet, but even the shortest jobs can become "long running" when they hit timeouts, rate limits, and other transient errors. You quickly need a scheduler to orchestrate retries without DDoS-ing your own systems, and you need to keep track of state to carry jobs to completion.
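One common way to retry without hammering a struggling dependency is capped exponential backoff with jitter. The sketch below is a generic illustration of that technique, not Dispatch's scheduler; `TransientError`, `backoff_delay`, and `run_with_retries` are hypothetical names, and the default delays are arbitrary.

```python
import random
import time

class TransientError(Exception):
    """Hypothetical marker for timeouts, rate limits, and similar failures."""

def backoff_delay(attempt, base=0.5, cap=30.0, rng=random):
    # "Full jitter": pick uniformly between 0 and the capped exponential bound.
    return rng.uniform(0, min(cap, base * 2 ** attempt))

def run_with_retries(job, max_attempts=5, sleep=time.sleep, rng=random):
    """Call job() until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return job()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(backoff_delay(attempt, rng=rng))
```

Passing `sleep` and `rng` as parameters keeps the sketch easy to test. Note that the attempt counter lives only in this process, which is exactly the state a scheduler has to persist to survive a crash.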

Combine a scheduler (like Dispatch) with a primitive like distributed coroutines, and you've got a powerful foundation to create distributed applications of all kinds without seeing complexity skyrocket.

1 comments

kgeist 817 days ago

OK, from what I understand, it's similar to what we do as well, except Dispatch adds magic while we do it all manually. We have an event-based system: instead of await points, we fire events which are stored inside an AMQP broker. The broker has N consumers on different nodes which take new jobs as they arrive. Retries, circuit breakers, etc. are added manually (via a Go library), and if a job/event handler fails, the job is re-added to the AMQP queue (someone else will process it later). Inside event handlers/job processors we also enjoy Go's built-in local scheduler (so I/O calls do not block entire cores).
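The pattern described above can be sketched roughly as follows. This is an in-memory Python illustration only: a `deque` stands in for the AMQP broker, and the event names, the `attempts` field, and `MAX_ATTEMPTS` are all assumptions rather than details of the system described in this comment.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical limit; a real system might dead-letter instead

def consume(queue, handlers):
    """Drain the queue, re-queuing events whose handler raises."""
    failed = []
    while queue:
        event = queue.popleft()
        try:
            # Instead of awaiting, a handler returns follow-up events to fire.
            for kind, payload in handlers[event["type"]](event["payload"]):
                queue.append({"type": kind, "payload": payload, "attempts": 0})
        except Exception:
            event["attempts"] += 1
            if event["attempts"] < MAX_ATTEMPTS:
                queue.append(event)  # someone else will process it later
            else:
                failed.append(event)
    return failed

shipped = []
flaky = {"failures_left": 1}  # simulates one transient failure in the demo

def on_order_placed(payload):
    return [("payment_captured", payload)]

def on_payment_captured(payload):
    if flaky["failures_left"] > 0:
        flaky["failures_left"] -= 1
        raise RuntimeError("transient failure")
    shipped.append(payload["order_id"])
    return []

queue = deque([{"type": "order_placed", "payload": {"order_id": 7}, "attempts": 0}])
failed = consume(queue, {
    "order_placed": on_order_placed,
    "payment_captured": on_payment_captured,
})
```

Everything a handler needs travels in the event payload, so any consumer can pick up any event. The cost is that the workflow's control flow is spread across the handlers.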

I can see the benefit that with Dispatch, logic is simpler to read and write as just ordinary functions, while in our approach, we have to scatter it around various event handlers/job processors. However, I still like that in our approach, event handlers/job processors are entirely stateless (the only state is the job/event payloads). I've found that to be good for scalability and reliability, and easier to reason about, compared to passing around internal coroutine state.

achille-roussel 817 days ago

Yes, that sounds very similar indeed. We've launched Dispatch because this is a universal problem, and engineering teams end up having to reinvent the solution over and over.

Dispatch can also handle the "one-off" jobs you describe, where you don't need to track the coroutine state. In a way, it's a subset/special case of the distributed coroutine (just like functions are a special case of coroutines with no yield point).
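The "functions are a special case of coroutines" point can be illustrated with Python generators. In this sketch, `drive` is a hypothetical local driver (not a Dispatch API) that runs either kind of callable and counts yield points; each yield is where a distributed scheduler could snapshot and reschedule the work.

```python
import inspect

def drive(fn, *args, on_yield=lambda request: request):
    """Run fn to completion, answering each yielded request via on_yield."""
    result = fn(*args)
    if not inspect.isgenerator(result):
        return result, 0  # plain function: finished with zero yield points
    yields = 0
    try:
        request = next(result)
        while True:
            yields += 1
            request = result.send(on_yield(request))
    except StopIteration as stop:
        return stop.value, yields

def one_off(x):
    # A "one-off" job: no yield point, so there is no state to track.
    return x * 2

def two_step(x):
    # Each yield hands a request to the driver and waits for its answer.
    doubled = yield ("double", x)
    final = yield ("increment", doubled)
    return final

def answer(request):
    op, value = request
    return value * 2 if op == "double" else value + 1

one_off_result = drive(one_off, 5)
two_step_result = drive(two_step, 5, on_yield=answer)
```

The same driver handles both cases; the one-off job simply never gives it a chance to pause.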