My experience is that it is not "dandy". We have had a lot of trouble pegging the CPUs on our Celery worker boxes, which run CPU-bound jobs. You get more than one CPU utilised, sure, but we can never seem to get all cores fully utilised. We rewrote some of the tasks as a single multi-threaded JVM process pulling from the same RabbitMQ queue, and it instantly and consistently pegged every CPU at 100%. I wish I knew how to get our Celery worker farm to full utilisation, because it would save us a fair bit of money.
We are using processes in Celery, not threads, yet we still can't get to full utilisation. On an eight-CPU machine we get around 650% to 750% CPU utilisation. It goes up and down within that range, but we never reach 800%. The JVM version just sits at 800% constantly.
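
To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the percentages are top-style readings where each core counts for 100%; the function name `utilisation_fraction` is just a hypothetical helper for illustration, not anything from Celery.

```python
def utilisation_fraction(cpu_percent: float, cores: int) -> float:
    """Convert a top-style CPU reading (100% per core) into a 0..1 share of the whole machine."""
    return cpu_percent / (100.0 * cores)


CORES = 8  # the eight-CPU worker box mentioned above

# Observed range for the Celery workers.
low = utilisation_fraction(650, CORES)
high = utilisation_fraction(750, CORES)

# Equivalent number of cores sitting idle at each end of the range.
idle_at_best = (1 - high) * CORES
idle_at_worst = (1 - low) * CORES

print(f"Celery workers use {low:.1%} to {high:.1%} of the machine")
print(f"That leaves roughly {idle_at_best:.1f} to {idle_at_worst:.1f} cores idle")
```

In other words, the gap against the JVM version amounts to somewhere between half a core and a core and a half per box, which adds up across a whole worker farm.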