by elijahbenizzy
697 days ago
Interesting. I don't disagree in general, but I have actually worked with a lot of applications that like to do this. Specifically, in the world of ML/AI inference there's a lot of moving between external querying of data (features) and internal/external querying of models. With recommendation systems it is often worse: gather large data, run a computation on it, filter it, get a bulk API request, score it with a model, etc.

This is exactly where I'd like to see it. I'd like to simultaneously:

1. Call out to external APIs and not incur any overhead/complexity of creating/managing threads
2. Call out to a model on a CPU and not have it block the event loop (I want it to launch a new thread and have that look similar to me)
3. Call out to a model on a GPU, ditto

And use the observed CPU/GPU resource usage to scale up nicely with an external horizontal scaling system.

So it might be that the async API is a lot easier to use and more ergonomic than threads. I'd be happy to handle thread-safety (say, by annotating routines), but as you pointed out, there are underlying framework assumptions that make this complicated.

The solution we always used is to separate out the CPU-bound components from the IO-bound components, even onto different servers or sidecar processes (which effectively turn CPU-bound operations into IO-bound ones). But if they could co-exist happily, I'd be very excited. Especially if they could use an API similar to the one async does.
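To make points 1 and 2 concrete, here is a minimal sketch of roughly what this looks like with today's standard library, using `asyncio.to_thread` (Python 3.9+). The function names `fetch_features`, `score_on_cpu`, and `recommend` are hypothetical stand-ins, and the sleeps only simulate an external API call and a blocking model call; this is an illustration, not how any particular framework does it.

```python
import asyncio
import time


async def fetch_features(user_id: int) -> list[float]:
    # Hypothetical stand-in for an external API call (IO-bound).
    await asyncio.sleep(0.05)
    return [float(user_id), 1.0, 2.0]


def score_on_cpu(features: list[float]) -> float:
    # Hypothetical stand-in for a blocking model call (CPU-bound in real life).
    time.sleep(0.05)
    return sum(features)


async def recommend(user_id: int) -> float:
    features = await fetch_features(user_id)
    # Run the blocking call in a worker thread so the event loop stays free,
    # while the call site still reads like any other awaited coroutine.
    return await asyncio.to_thread(score_on_cpu, features)


async def main() -> list[float]:
    # Several requests interleave their IO and their threaded scoring.
    return await asyncio.gather(*(recommend(u) for u in range(3)))


scores = asyncio.run(main())
print(scores)
```

The caveat is the one already discussed: a worker thread only helps if the blocking call releases the GIL (many native ML libraries do for heavy operations, pure Python code does not). Otherwise the usual fallback is `loop.run_in_executor` with a `concurrent.futures.ProcessPoolExecutor`, which is essentially the sidecar-process approach behind an awaitable.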