by retrogradeorbit 3643 days ago
My experience is that it is not "dandy". We have had a lot of trouble pegging the CPUs on our Celery worker boxes (doing CPU-bound jobs). You get more than one CPU utilised, sure, but we can never seem to get all cores fully utilised. We rewrote some of the tasks into a single multi-threaded JVM process pulling off the rabbit queue and they instantly and consistently pegged every CPU at 100%. I wish I knew how to get our Celery worker farm to full utilisation because it would save us a fair bit of money.

He's saying run 32 individual processes, not 32 threads within one process. Python's global interpreter lock will hobble you if you're using threads.
Celery has a worker pool of separate Python processes that jobs can be offloaded to. It sidesteps the GIL because it doesn't use threading.
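For anyone who wants to see the process-per-core idea in isolation, here is a minimal sketch using the standard-library multiprocessing module rather than Celery itself. The names `burn` and `run_pool`, the job sizes, and the count of 32 jobs are made up for illustration; `burn` is only a stand-in for a real CPU-bound task.

```python
import multiprocessing as mp
import os


def burn(n):
    """Hypothetical CPU-bound task: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_pool(jobs, workers=None):
    """Spread jobs over separate processes, each with its own interpreter and GIL."""
    workers = workers or os.cpu_count() or 1
    with mp.Pool(processes=workers) as pool:
        return pool.map(burn, jobs)


if __name__ == "__main__":
    # 32 equally sized jobs; watch top or htop while this runs.
    results = run_pool([200_000] * 32)
    print(len(results), "jobs finished")
```

With one worker process per core, the only things keeping a core idle should be job dispatch and uneven job sizes, so very short or very lopsided jobs will show less than full utilisation.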
Well, two ways. Either launch 32 or 64 copies of the process using multiprocessing, or 1000 threads with geventlet.

I have never found a box I couldn't peg :)

I guess if you had 1 GB of RAM and 16 cores it would be challenging in Python... but with a few GB of RAM we are on.

We are using processes in Celery, not threads. Yet we still can't get to full utilisation. On an eight-CPU machine we get around 650% - 750% CPU utilisation. It varies, going up and down within that range. We never get to 800%. The JVM version is just constantly 800%.
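To put those numbers in perspective, here is a quick calculation, assuming the percentages are the usual top-style figures where each fully busy core counts as 100%. The helper name `utilisation_fraction` is made up for this sketch.

```python
def utilisation_fraction(reported_percent, cores):
    """Convert a top-style summed CPU percentage into a fraction of total capacity."""
    return reported_percent / (100.0 * cores)


if __name__ == "__main__":
    for reported in (650, 750, 800):
        share = utilisation_fraction(reported, 8)
        print(f"{reported}% on 8 cores = {share:.2%} of the machine")
```

On that reading, 650% to 750% is roughly 81% to 94% of the box, so somewhere between half a core and a core and a half is sitting idle at any moment.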