by ben509 2203 days ago

> Except IO is the bottleneck here.

If you say IO is the bottleneck, then you're claiming there is no significant difference between python and node. That's what a bottleneck means.

> The concurrency model for IO should determine overall speed.

"Speed" is ambiguous: it means either latency or throughput. Yeah, yeah, sob in your pillow about how mean elites are, clean up your mascara, and learn the correct terminology.

We've already established that the concurrency model is asynchronous IO for both Python and Node. Since they are both doing the same basic thing, setting up an event loop and polling the OS for responses, it's not a question of which has the superior model.

> If python async is slower for IO tasks than sync, then that IS an unexpected result and an indication of a python specific problem.

Both sync and async IO have their own implementations. If you read from a file synchronously, you're calling out to the OS and getting a result back with no interpreter involvement. This[2] is a simple single-threaded server in C. All it does is tell the kernel, "here's my IO, wake me up when it's done."

When you do async work, you have to schedule IO and then poll for it. This[1] is an example of doing that with epoll in plain C. Polling involves more calls into the kernel to tell it which events to look for, and then the application has to branch on the different possible events.

And you can't avoid this if you want to manage IO asynchronously. If you use synchronous IO in threads or processes, you're still constructing threads or processes. (Which makes sense if you needed them anyway.)

So unless an interpreter builds its synchronous calls on top of async, sync necessarily has less involvement with both the kernel and interpreter.
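To make the difference concrete, here is a rough Python sketch of the two paths, using the standard `selectors` and `socket` modules. The function names `blocking_read`, `polled_read`, and `demo` are made up for illustration, and this is not how any particular framework does it; it only shows that the polled path needs extra registration, polling, and dispatch steps before the same `recv` call.

```python
import selectors
import socket


def blocking_read(sock: socket.socket) -> bytes:
    # Synchronous path: a single call; the kernel parks us until data arrives.
    return sock.recv(1024)


def polled_read(sock: socket.socket) -> bytes:
    # Async-style path: register interest, poll, branch on the event, then read.
    sel = selectors.DefaultSelector()
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ)
    try:
        while True:
            for key, events in sel.select(timeout=1.0):
                if events & selectors.EVENT_READ:
                    return key.fileobj.recv(1024)
    finally:
        sel.unregister(sock)
        sel.close()


def demo() -> tuple:
    # A connected socket pair stands in for a real client/server connection.
    a, b = socket.socketpair()
    try:
        a.sendall(b"ping")
        first = blocking_read(b)
        a.sendall(b"pong")
        second = polled_read(b)
    finally:
        a.close()
        b.close()
    return first, second
```

Both functions return the same bytes; the polled version just does more work in the interpreter and makes more calls into the kernel to get there.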

The reason the interpreter matters is because the latency picture of async is very linear:

* event loop wakes up task
* interpreter processes application code
* application wants to open / read / write / etc
* interpreter processes stdlib adding a new task
* event loop wakes up IO task
* interpreter processes stdlib checking on task
* kernel actually checks on task
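Those steps can be mimicked with a toy scheduler built on generators. This is only a sketch: `task` and `run` are hypothetical names, the "IO" completes instantly, and a real loop such as asyncio's does far more bookkeeping. The point is that every step, including the loop's own queue handling, runs one after another in the interpreter.

```python
from collections import deque


def task(name, log):
    log.append(f"{name}: application code runs")
    yield "io"  # application asks for IO; control goes back to the loop
    log.append(f"{name}: resumed after IO")


def run(tasks):
    ready = deque(tasks)   # tasks the loop can wake up now
    waiting = deque()      # tasks parked on pending IO
    while ready or waiting:
        if ready:
            current = ready.popleft()
            try:
                next(current)            # interpreter runs application code
                waiting.append(current)  # loop records the new IO task
            except StopIteration:
                pass                     # task finished
        else:
            # Pretend the kernel reports every pending IO as complete.
            ready.extend(waiting)
            waiting.clear()


log = []
run([task("A", log), task("B", log)])
```

After the run, `log` holds the four entries in strictly sequential order: both tasks run their application code, then both are resumed.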

Since an event loop is a single-threaded operation, each one of these operations is sequential. Your maximum throughput, then, is limited by the interpreter being able to complete IO operations as fast as it is asked to initiate them.
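As back-of-the-envelope arithmetic, with made-up numbers: if the interpreter spends a fixed amount of CPU time per IO operation, that time caps the loop's throughput no matter how long the IO itself takes, while the IO wait shows up in per-request latency. The helper names and the figures below are assumptions for illustration, not measurements.

```python
def loop_ceiling(interp_seconds_per_op: float) -> float:
    """Upper bound on operations per second for one single-threaded loop."""
    return 1.0 / interp_seconds_per_op


def request_latency(interp_seconds_per_op: float, io_wait_seconds: float) -> float:
    """Latency of one request: interpreter work plus time waiting on IO."""
    return interp_seconds_per_op + io_wait_seconds


# Hypothetical figures: 50 microseconds of interpreter work, 2 ms of IO wait.
ceiling = loop_ceiling(50e-6)
latency = request_latency(50e-6, 2e-3)
```

With these assumed figures the ceiling is about 20,000 operations per second and the latency about 2.05 ms. Halving the IO wait would improve latency but leave the ceiling where it is.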

I'm not familiar enough with it to be certain, but Node may do much of that work in entirely native code. Python is likely slow because it implements the event loop in python[3].

So, not only is Python's interpreter slower than Node's, but it also has to shuffle tasks in the interpreter. If Node is managing a single event loop entirely in low-level code, that's less work it's doing, and even if it's not, Node can JIT-compile some or all of that interpreter work.

[1]: https://github.com/o0myself0o/epoll/blob/master/epoll.c

[2]: https://www.programminglogic.com/example-of-client-server-pr...

[3]: https://github.com/python/cpython/blob/3.8/Lib/asyncio/unix_...

1 comments

crimsonalucard1 2203 days ago

>If you say IO is the bottleneck, then you're claiming there is no significant difference between python and node. That's what a bottleneck means.

My claim is that this SHOULD be what's happening, by the obvious logic that tasks handled in parallel with IO should be faster than tasks handled sequentially, and under the assumption that IO takes up far more time than local processing.

Like I said, the fact that this is NOT happening within the Python ecosystem, assuming the axioms above are true, indicates a flaw that is Python-specific.

>The reason the interpreter matters is because the latency picture of async is very linear:

I would say it shouldn't matter if done properly, because the local latency should be a fraction of the round-trip travel time plus database processing time.

>Python is likely slow because it implements the event loop in python

Yeah, we're in agreement. I said it was a Python-specific problem.

If you take a single task in this benchmark for Python, and the interpreter spends more time processing the task locally than the total round-trip travel time plus database processing time, then the database is faster than Python. If database calls are faster than Python, then this is a Python-specific issue.
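One rough way to check that premise is to time how long asyncio spends per task when the task does no real IO at all, then compare that with a measured round trip. The sketch below uses only the standard library; `ROUND_TRIP_SECONDS` is a placeholder I made up, not a number from the benchmark, and the helper names are hypothetical.

```python
import asyncio
import time

# Placeholder value: substitute a measured network + database round trip.
ROUND_TRIP_SECONDS = 1e-3


async def noop() -> None:
    # Yield to the event loop once without doing any real IO.
    await asyncio.sleep(0)


async def measure(n: int) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(noop() for _ in range(n)))
    return (time.perf_counter() - start) / n


def per_task_overhead(n: int = 10_000) -> float:
    """Average seconds the loop and interpreter spend per trivial task."""
    return asyncio.run(measure(n))


def interpreter_dominates(overhead: float, round_trip: float) -> bool:
    # True would mean local processing costs more than the round trip.
    return overhead > round_trip
```

Calling `interpreter_dominates(per_task_overhead(), ROUND_TRIP_SECONDS)` gives a crude yes/no. It ignores the real work a request handler does, so it only bounds the loop's own overhead from below.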