Until I used Elixir, I thought workers/queues were enough. But over the last nearly three years, I've come to the view that workers/queues are almost always strictly inferior. Workers/queues in languages like Ruby have problems like:

* Require very specific ergonomics (for example, don't hand the model over, hand over the ID so you can pull over the freshest version and not overwrite)
* They require a separate storage system, like your DB, Redis, etc. This doesn't sound big, but when doing complex things it can turn into hell.
* They have to be run in a separate process, which makes deployment more difficult.
* They're slow. Almost all of them work on polling the receiving tables for work, which means you've got a lag time of 1-5 seconds per job. Furthermore, the worse your system load, the slower they go.
* You can't reliably "resume" from going multi-process. Let's say you're fine with the user waiting 2-3 seconds to have a request finish. With workers/queues, you either have to poll to figure out when something finished (which is not only very slow, but error-prone), or you have to just go slow and not multi-process, making it into an 8-10 second request even though you've got the processing power to go faster.

So, you've got all that. Or in Elixir, for a simple case, you replace `Enum` (your generic collection functions) with `Flow` and suddenly the whole thing is parallel. I mean that pretty literally too: when I need free performance on collections, that's usually what I do. It works 95% of the time, and the other 5% is where you need really specific functionality anyway, and for those cases Elixir still has the best solution I've ever seen.
Shopify, for example, use Resque (Ruby + Redis) to process thousands of background jobs per second.
> * Require very specific ergonomics (for example, don't hand the model over, hand over the ID so you can pull over the freshest version and not overwrite)
This is good practice but certainly not a requirement. You can also pass the object itself in a serialized format such as JSON or Protobuf.
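To make the trade-off concrete, here is a minimal sketch of both payload styles using only Ruby's standard `json` library. The `User` struct, the `enqueue_by_id` and `enqueue_snapshot` helpers, and the in-memory `QUEUE` array are hypothetical stand-ins for a real model and job backend; none of this is Resque's or Sidekiq's actual API.

```ruby
require "json"

# Hypothetical model; a real app would use an ORM class instead.
User = Struct.new(:id, :name, :email, keyword_init: true)

# In-memory stand-in for a job backend such as Redis (assumption for illustration).
QUEUE = []

# Style 1: enqueue only the ID; the worker reloads the freshest record itself.
def enqueue_by_id(user)
  QUEUE << JSON.generate("job" => "welcome_email", "user_id" => user.id)
end

# Style 2: enqueue a serialized snapshot; the worker needs no lookup,
# but it sees the data as it was at enqueue time.
def enqueue_snapshot(user)
  QUEUE << JSON.generate("job" => "welcome_email", "user" => user.to_h)
end

user = User.new(id: 42, name: "Ada", email: "ada@example.com")
enqueue_by_id(user)
enqueue_snapshot(user)

# What a worker would see after pulling each payload off the queue.
by_id, snapshot = QUEUE.map { |raw| JSON.parse(raw) }
```

Either payload survives the round trip through the queue. The ID style just costs one extra lookup in exchange for never acting on stale data.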
> * They require a separate storage system, like your DB, Redis, etc. This doesn't sound big, but when doing complex things it can turn into hell.
ETS and Mnesia aren't production-ready job queues, unfortunately: https://news.ycombinator.com/item?id=9828608
> * They have to be run in a separate process, which makes deployment more difficult.
Background tasks have different resource and scaling requirements from web requests, so running them in a separate process is a good idea regardless.
> * They're slow. Almost all of them work on polling the receiving tables for work, which means you've got a lag time of 1-5 seconds per job. Furthermore, the worse your system load, the slower they go.
Redis queues have millisecond latency, and polling isn't required. Sidekiq uses BRPOP to block until a job arrives (Resque polls, but on a configurable interval). BRPOP is O(1), so it doesn't slow down as the queue backs up.
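The difference between blocking and polling can be sketched without a Redis server. In the sketch below, Ruby's built-in thread-safe `Queue` is swapped in for a Redis list as an assumption for illustration: `Queue#pop` parks the caller until an item arrives, which is roughly the behaviour BRPOP gives a worker. The job names and the `:shutdown` sentinel are made up.

```ruby
# Ruby's thread-safe Queue stands in for a Redis list here (assumption):
# Queue#pop blocks the caller until an item arrives, much as BRPOP does.
jobs = Queue.new
processed = Queue.new

worker = Thread.new do
  # pop parks the thread until a job is pushed; there is no sleep/poll loop.
  while (job = jobs.pop) != :shutdown
    processed << "done: #{job}"
  end
end

jobs << "resize_image"
jobs << "send_email"
jobs << :shutdown # hypothetical sentinel telling the worker to exit
worker.join

# Drain the results in the order the worker produced them.
results = []
results << processed.pop until processed.empty?
```

The worker wakes as soon as a job is pushed, so latency is bounded by thread scheduling rather than by a polling interval.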
PostgreSQL has LISTEN/NOTIFY to announce new jobs or a state change of an existing job, so there's no need to poll. SELECT ... FOR UPDATE SKIP LOCKED also lets workers claim jobs without blocking on each other, which keeps performance from degrading under load.
> * You can't reliably "resume" from going multi-process. Let's say you're fine with the user waiting 2-3 seconds to have a request finish. With workers/queues, you either have to poll to figure out when something finished (which is not only very slow, but error-prone), or you have to just go slow and not multi-process, making it into an 8-10 second request even though you've got the processing power to go faster.
There are several better options here:

* Threads: MRI's GIL is released during blocking IO, so threads can run IO in parallel anyway, and JRuby has no GIL at all.
* Pub/Sub: both Redis and PG have a solid basic implementation that is usable from their Ruby clients.
* Websockets: respond early and notify the client directly from the background job.
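As a rough illustration of the threads option, the sketch below fans three simulated IO calls out to threads inside a single request and then joins them. The `fetch` helper and the call names are hypothetical, and `sleep` stands in for real blocking IO such as an HTTP request; it is used here on the assumption that it releases the GIL the same way.

```ruby
# Simulated blocking IO call (hypothetical); sleep releases the GIL
# in MRI, as a socket read or HTTP request would.
def fetch(name, seconds)
  sleep(seconds)
  "#{name} ok"
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Run three "requests" concurrently inside the same process.
calls = [["profile", 0.2], ["orders", 0.2], ["recommendations", 0.2]]
threads = calls.map do |name, seconds|
  Thread.new { fetch(name, seconds) }
end

# Thread#value joins the thread and returns the block's result,
# so the request "resumes" here with every answer in hand.
results = threads.map(&:value)

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
```

Run sequentially, these calls would take about 0.6 seconds. With threads, `elapsed` should land near the longest single call, and nothing has to poll to learn that the work finished.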