Hacker News new | ask | show | jobs
by postpawl 2358 days ago
Exactly, and I’m surprised none of the other comments mentioned this so far. Often web app endpoints are bottlenecked by IO because they’re spending most of their time talking to a database or cache server. Python is probably not the right tool for a CPU intensive endpoint that needs to serve up hundreds of thousands of requests per minute and can’t be cached.

There are a lot of ways to handle the IO intensive scenario in python:

* Threading - Works with python libraries written in C, but now you need to add locks to your code to prevent race conditions. Not good for CPU heavy work because it’s switching context every ~100 instructions.

* Gevent - Requires minimal code changes, but it usually blocks when running code from python libraries written in C. This uses event loop style concurrency and automatically patches internal python libraries to switch context when IO occurs. This means less time spent unnecessarily switching context, so it can scale better than python’s threading.

* multiprocessing - Better for CPU intensive work, but requires more memory than other solutions. You don’t need to worry about race conditions since the processes are separate.

* asyncio - Requires code changes and using compatible libraries. It does event loop style concurrency that allows you to specify specifically when to switch context. Like with gevent, this means less time spent unnecessarily switching context.

* ...And there are probably a bunch of other ways I’m missing.

I’ve heard that this sort of stuff is simpler in Go.

2 comments

From what I’ve seen in production environments, Python has an uncanny ability to take tasks that should be IO- / server- bound and make them CPU-bound.
For sure, but often that’s a sign that you’re doing it wrong. Maybe those aggregations should be done in the database, maybe it should be cached, maybe you should do that in bulk with one request, maybe it should run in the background anyway, etc.
or maybe you should consider using C++, C#, Java, et al instead?
i.e. "maybe do it anywhere else than in Python"
Sometimes the language is the reason it is IO bound. A good example of this and discussion is in the link below. The heap of abstractions can cause it to be IO bound, or the CPU usage can bound the IO.

https://erikengbrecht.blogspot.com/2008/06/what-does-io-boun...