by h0l0gr4ph1c 1919 days ago
> "Using a proper, single-threaded executor and running it on the current thread seems like it would work, yes. (To be fair, I also feel like just having the sync version of a Python API call "trio.run(self.equivalent_async_api)" would probably also work, and I don't totally follow why that's insufficient....)"

Why do I need all this other bloat just to run a function? Why can't I just run the function?

I'm not sure exactly what you're asking, but I think it's either answered by the "What color is your function?" article I linked above, or by the answer that this is exactly why unasync exists and why I suggested that approach is worth considering, or by the answer that you can, in fact, just run the function and the "bloat" (which is just syntactic bloat - note that performance is generally going to be better!) is taken care of behind the scenes by a wrapper that calls an executor for you.
Yes, sorry, those were rhetorical questions. I take your point that you don't see why a blocking executor isn't enough to deal with the async code. My problem is with needing the executor at all. I must have skipped a couple of your previous points in this thread. Apologies about that...

Maybe we should start trying to think about async as being something people can use if they want and ignore if they want. Code being async compatible rather than async required.

How would this work? (I do really think this is the right model, I'm just trying to figure out what that model is, exactly. :) )

Let's say I have code like this, in Python asyncio:

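    import asyncio

    class ShardedDBClient:
        async def query(self, key):
            # Build one coroutine per shard; pass the shard so each task differs.
            tasks = [self.query_shard(shard, key) for shard in self.shards]
            # Run all shard queries concurrently and wait for every result.
            results = await asyncio.gather(*tasks)
            for partial_result in results:
                if key in partial_result:
                    return partial_result[key]
            return None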
How do you run this without an executor?

The obvious way to make it not be "async required" is to say, we get rid of the async/await keywords - but what do you do with that "await asyncio.gather" instruction? Do you call each of those callbacks serially?

Generally, even in Rust (perhaps especially in Rust), I would expect this to use some OS facility for waiting on multiple sockets (possibly even just boring select(), but preferably epoll/kqueue) to send a bunch of database requests out in parallel and then wait on all their sockets to handle responses as they arrive. I would expect that even if my own code doesn't involve async/await at all.
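To make that concrete, here is a rough sketch of the "no async/await, but still waiting on many sockets at once" style, using Python's standard selectors module (which picks epoll/kqueue/select for you). Everything about the shards is made up for illustration: socket.socketpair() stands in for real connections, the function name query_all_shards is hypothetical, and the fake shards just echo the request back in upper case.

```python
import selectors
import socket


def query_all_shards(payloads):
    """Send every request first, then wait on all sockets together."""
    sel = selectors.DefaultSelector()
    shard_ends = []
    for shard_id, payload in enumerate(payloads):
        # socketpair() is a stand-in for a real connection to a shard.
        client, shard = socket.socketpair()
        client.sendall(payload)
        sel.register(client, selectors.EVENT_READ, data=shard_id)
        shard_ends.append(shard)

    # Fake shard behaviour (an assumption for this demo): upper-case echo.
    for shard in shard_ends:
        shard.sendall(shard.recv(1024).upper())

    results = {}
    while len(results) < len(payloads):
        events = sel.select(timeout=1.0)
        if not events:
            break  # give up rather than hang forever
        for key, _mask in events:
            # Handle each response as soon as its socket is readable.
            results[key.data] = key.fileobj.recv(1024)
            sel.unregister(key.fileobj)
            key.fileobj.close()

    sel.close()
    for shard in shard_ends:
        shard.close()
    return results


responses = query_all_shards([b"a", b"b", b"c"])
```

The loop at the bottom is more or less what an executor does for you; writing it by hand is the thing the async keywords are sugar over.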

The easy way to implement that is

        def sync_query(self, key):
            return asyncio.run(self.query(key))
which creates an asyncio executor just to run that one function.
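Here is a self-contained sketch of that wrapper so you can see it run. The shard representation is an assumption made up for the demo: each "shard" is a plain dict, and query_shard fakes a network round trip with a short sleep.

```python
import asyncio


class ShardedDBClient:
    def __init__(self, shards):
        # Hypothetical stand-in: each "shard" is just a dict.
        self.shards = shards

    async def query_shard(self, shard, key):
        await asyncio.sleep(0.01)  # pretend network round trip
        return {key: shard[key]} if key in shard else {}

    async def query(self, key):
        tasks = [self.query_shard(shard, key) for shard in self.shards]
        results = await asyncio.gather(*tasks)
        for partial_result in results:
            if key in partial_result:
                return partial_result[key]
        return None

    def sync_query(self, key):
        # Starts an event loop, runs this one coroutine, then tears it down.
        return asyncio.run(self.query(key))


client = ShardedDBClient([{"a": 1}, {"b": 2}, {"c": 3}])
found = client.sync_query("b")    # plain function call, no await needed
missing = client.sync_query("z")  # no shard has this key
```

The caller never touches async/await. One caveat: asyncio.run raises RuntimeError if it is called from a thread that already has a running event loop, so this wrapper is only for callers that really are synchronous.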

This is going to be a lot faster than querying those shards one at a time! And it also can semantically change how the library behaves - imagine that there's a timeout parameter, and I set a 100ms timeout. I probably mean that to be 100ms for the entire operation, not 100ms per request, but I probably also don't expect my calls to always fail if each query takes 10ms and there are more than 10 shards.
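A small sketch of the timeout point, with made-up numbers in the same spirit: 20 fake shards at 10ms each, and a 100ms deadline for the whole operation. The helper names (query_concurrently, query_serially, run_with_deadline) are hypothetical, and asyncio.sleep stands in for the real network call.

```python
import asyncio


async def query_shard(shard_id, delay=0.01):
    await asyncio.sleep(delay)  # pretend each shard takes about 10ms
    return shard_id


async def query_concurrently(n_shards):
    # All requests are in flight at once, so total time is about one delay.
    return await asyncio.gather(*(query_shard(i) for i in range(n_shards)))


async def query_serially(n_shards):
    # One request at a time, so total time is about n_shards * delay.
    return [await query_shard(i) for i in range(n_shards)]


def run_with_deadline(coro, timeout=0.1):
    async def runner():
        # The deadline covers the entire operation, not each request.
        return await asyncio.wait_for(coro, timeout)

    try:
        return asyncio.run(runner())
    except asyncio.TimeoutError:
        return "timed out"


concurrent_outcome = run_with_deadline(query_concurrently(20))
serial_outcome = run_with_deadline(query_serially(20))
```

On an otherwise idle machine the concurrent version should finish well inside the deadline, while the serial one needs at least 200ms and hits the timeout. Same library, same timeout argument, different observable behaviour.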

The downside is that this library is quietly using asyncio without you knowing. But how exactly is that a downside? I already expect the library to be using select/epoll/kqueue without me knowing. And in a language like Rust, the executor should basically compile out - it should be a "zero-cost abstraction" compared to writing the event-handling code by hand.

As long as the async code doesn't depend on calling itself concurrently it should be straightforward to simply execute it on the current thread, right? (Basically using an executor that has 1 thread, the current thread.)
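For what it's worth, in Python this is already how asyncio.run behaves: the event loop it creates runs on the calling thread, with no thread pool involved. A quick way to check that, as a minimal sketch (the coroutine name which_thread is made up for the demo):

```python
import asyncio
import threading


async def which_thread():
    await asyncio.sleep(0)  # yield to the event loop once
    return threading.get_ident()


caller_thread = threading.get_ident()
coroutine_thread = asyncio.run(which_thread())

# The coroutine ran on the same thread that called asyncio.run.
same_thread = caller_thread == coroutine_thread
print(same_thread)  # prints True
```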

And it'd be great to check and optimize away all this at compile time.