by asksol
3662 days ago
Author of Celery here. This is an interesting presentation! I want to clarify something about Celery and RAM usage. When writing web crawlers and other (mostly) I/O-bound tasks, you should use the eventlet/gevent execution pools instead of the multiprocessing one. This drastically reduces memory use and performs better. If you have four CPU cores, you can start four worker instances with 1000 green threads each (4,000 in total):
`celery multi start 4 -A proj -P gevent -c 1000`
This uses all the CPU cores in your system and works around the GIL, because each worker instance is a separate process. One of the new features coming in Celery 4 is a message protocol with support for multiple languages, so maybe we could have an Elixir worker soon.
> The slides mention problems with Celery and RAM usage when writing crawlers, but since this is a mostly I/O-bound task you should be using the eventlet/gevent execution pools instead of the multiprocessing one.
Here: https://klibert.pl/statics/python-and-elixir/#/5/6 in line 18 you can see a `_do_some_real_processing_function`. The whole premise of the project is that this function is CPU-bound. My processor has four cores, so I create a pool of 4 Python processes (https://klibert.pl/statics/python-and-elixir/#/5/2 line 10).
Fetching pages is an IO-bound task, so it's done by Elixir. There we have a pool (for rate limiting) of 10 processes (the Erlang ones - important distinction) that do the downloading.
I think the closest analogy to what happens in this project is a Twisted (EDIT: or any other concurrent, but not parallel framework) app which uses a pool of processes for CPU-bound tasks. Here the Twisted part is replaced with Elixir.
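To make the analogy concrete, here is a rough sketch of the same split in plain Python, using asyncio in place of Twisted or Elixir. It is not the code from the slides: `fetch`, `process_page`, `crawl`, and `run_demo` are hypothetical names, the download is simulated with a sleep, and word counting stands in for the CPU-bound function. Only the numbers (10 concurrent downloads, 4 worker processes) come from the description above.

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor

MAX_FETCHES = 10  # mirrors the pool of 10 downloaders used for rate limiting
CPU_WORKERS = 4   # one worker process per core


def process_page(html: str) -> int:
    """Stand-in for the CPU-bound step (hypothetical): count words."""
    return len(html.split())


async def fetch(url: str) -> str:
    """Simulated download (hypothetical); a real crawler would do network I/O."""
    await asyncio.sleep(0.01)
    return f"<html>page body for {url}</html>"


async def crawl(urls: list[str], executor: Executor) -> dict[str, int]:
    """Fetch concurrently on the event loop, process in the executor."""
    loop = asyncio.get_running_loop()
    limiter = asyncio.Semaphore(MAX_FETCHES)

    async def handle(url: str) -> tuple[str, int]:
        # I/O-bound part: at most MAX_FETCHES downloads in flight.
        async with limiter:
            html = await fetch(url)
        # CPU-bound part: handed off so the event loop is never blocked.
        result = await loop.run_in_executor(executor, process_page, html)
        return url, result

    pairs = await asyncio.gather(*(handle(u) for u in urls))
    return dict(pairs)


def run_demo(urls: list[str]) -> dict[str, int]:
    """Run the crawl with a pool of CPU_WORKERS separate Python processes."""
    with ProcessPoolExecutor(max_workers=CPU_WORKERS) as pool:
        return asyncio.run(crawl(urls, pool))
```

Call `run_demo([...])` from under an `if __name__ == "__main__":` guard, since process pools need one on platforms that spawn workers. The event loop gives concurrency for the downloads, and the process pool gives parallelism for the processing, which is the division of labour described above.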
EDIT: Also, we use Celery extensively at work; it works great and there is no real need to replace it with anything! Again, this project is just a tech demo, and it doesn't make (much) sense on its own. But there are other possible integration patterns where Elixir and Python have different roles, which actually do make sense. I think.