by vc8f6vVV 1017 days ago
> Why don’t people like async?

That's pretty simple. The primary goal of every software engineer is (or at least should be) ... no, not to learn a new cool technology, but to get the shit done. There are cases where async might be beneficial, but those cases are few and far between. In all other cases a simple thread model, or even a single thread, works just fine without incurring extra mental overhead. As professionals we need to think not only about whether some technology is fun, but also about how much it actually costs our employer and about those who are going to maintain our "cool" code when we leave for better pastures. I know, I know, I sound like a grandpa (and I actually am).

5 comments

Show me how to cancel a network request using only threads, with no access to the underlying socket APIs? Because that's trivial with `async`.

That's not "fun", that's table stakes.
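A minimal sketch of the kind of cancellation meant here, using Python's stdlib `asyncio` as a stand-in (the thread is mostly about Rust, so treat this as an illustration of the idea, not of any particular runtime). `opaque_request` and `fetch_with_deadline` are hypothetical names; the "request" is just a long sleep.

```python
import asyncio


async def opaque_request(log):
    # Hypothetical stand-in for a client call we cannot modify.
    try:
        await asyncio.sleep(10)  # yield point where cancellation is delivered
        return "response"
    finally:
        # Runs on normal completion and on cancellation alike.
        log.append("cleanup ran")


async def fetch_with_deadline(log, timeout):
    # The caller needs no socket access: wait_for cancels the inner task
    # when the deadline passes.
    try:
        return await asyncio.wait_for(opaque_request(log), timeout)
    except asyncio.TimeoutError:
        return None


log = []
result = asyncio.run(fetch_with_deadline(log, 0.05))
```

The caller gets `None` after roughly 50 ms, and the `finally` block inside the callee still runs, which is the part that is awkward to arrange from outside a plain blocking thread.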

You can cancel socket operations using signals. You can, e.g., have one or more background threads running timers that interrupt the blocking IO if it doesn’t return in a timely manner. A lot of very important frameworks and services that are used in billions of transactions per day use this model.
Of course you can. It does mean that you need cooperation between the child and parent thread (to set up the signal handler so that resources are cleaned up) though. That's easy in a framework, kind of a pain in the ass if you're just trying to get some opaque client you were passed to do something in <10 seconds.

And that's just for IO. I mentioned elsewhere that you may want to cancel pure compute work.

You can see my point, I assume, that when your userspace program can cancel tasks natively it's much easier to work with?

Can you cancel a tight computing loop (i.e. without system calls and without yielding of any sort) with async? I wonder how? Also, if you can inject cleanup code into your async task, what prevents you from doing it with threads? Such things existed long before async/await, and system calls didn't change for async/await. Also, what's the difference between a "framework" and an async/await runtime? Isn't the latter a kind of framework?
Without any yielding? Seems hard. You could park the thread idk.

> what's the difference between "framework" and async/await runtime,

Sure, in that in both cases you have the threads managed for you. But there's a difference between spawning a raw pthread, which will have no signal handlers/ cleanup hooks, and one managed by a framework where it can add all of those things and more.

Interesting, in the Java world Thread.stop is deprecated too: https://docs.oracle.com/javase/7/docs/technotes/guides/concu... Which means there is no good way to actually stop a thread involuntarily. Of course in most simple apps it's not a big deal, but I would not do it in long-running apps.

OTOH, in Rust the async model is based on polling. Which means that poll must never block, but instead has to register a wake callback if no data is available. So there is no way to interrupt a rogue task, and all async functions have to rely on callbacks to wake them (welcome to Windows 3.1, only inside out!). The thread model is much more lax in this sense, e.g. even though my web server (akka-http) is based on futures, nothing prevents me from blocking inside my future, and in most cases I can get away with it. As I understand it, that's not possible in the Rust async model: I can only use non-blocking async functions inside an async function. So in reality you don't interrupt or clean up anything in Rust when a timeout happens, you simply abandon execution (i.e. stop polling). I wonder what happens with resources if any were allocated.
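On the "what happens with resources" question, here is a loose Python analogue, not Rust: a coroutine is driven by hand (one `send` standing in for one poll) and then abandoned. In Python the abandonment is made explicit with `close()`, which raises `GeneratorExit` at the suspension point so `finally` blocks run. My understanding is that dropping a Rust future similarly runs the destructors of whatever state it holds, but that is an assumption this sketch does not demonstrate. `Pending` and `task_holding_resource` are made-up names.

```python
class Pending:
    """Awaitable that suspends exactly once, like a poll returning 'not ready'."""

    def __await__(self):
        yield


async def task_holding_resource(events):
    events.append("acquired")
    try:
        await Pending()            # suspension point
        events.append("finished")  # never reached in this sketch
    finally:
        events.append("released")


events = []
coro = task_holding_resource(events)
coro.send(None)  # first "poll": runs until the suspension point
coro.close()     # abandon the task instead of polling it again
```

After this, `events` holds `"acquired"` followed by `"released"`, with no `"finished"`: the work was abandoned, but the cleanup still ran.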

The last comment is actually pretty interesting and spot on. In the Java/JDK world, which you can think of as a "framework", you can cancel blocking IO via the Thread.interrupt() mechanism. And that works because it’s deeply integrated into the framework, similar to how async Rust runtimes provide support for cancellation.
> Show me how to cancel a network requests using only threads, with no access to the underlying socket APIs?

It’s been a long time since I did this in Rust. But why do you not have access to the sockets or at least a set_timeout method? Is it a higher level lib that omits such crucial features?

In Go, the super common net.Conn interface has deadline methods. Not everyone knows their importance but generally you have something like it piped through to the higher layers.
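For comparison, a sketch of the same "deadline on the connection" idea with Python's stdlib sockets. This is not Go's `net.Conn` API or a Rust one; `read_with_timeout` is a hypothetical helper, and a local `socketpair` stands in for a real network peer.

```python
import socket


def read_with_timeout(sock, seconds):
    # Give up on the blocking read if nothing arrives within `seconds`.
    sock.settimeout(seconds)
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None


left, right = socket.socketpair()
try:
    timed_out = read_with_timeout(left, 0.05)  # peer is silent: returns None
    right.sendall(b"pong")
    answered = read_with_timeout(left, 0.05)   # data is waiting: returns it
finally:
    left.close()
    right.close()
```

This only works when whoever owns the socket exposes the timeout, which is exactly the access the parent comment assumes you do not have.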

EDIT: Oh I see you replied to my other comment. Please disregard.

Total rust newb here, but does that need the full async story, or is it a limitation of an API somewhere? From the point of view of the code using the request's response could you use a channel with recv_timeout? Is the problem there that the thread with the socket connection is still going and there's no way to stop it?
The ability to cancel an operation without talking to the operating system requires that your program has yield points. That yielding is what allows another part of the program to take control and say "OK, I'm done with you now, no need to finish".

Yes, the problem is that your thread would continue to perform work even if you stopped waiting on it.
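A small sketch of that problem in Python, with `queue.Queue.get(timeout=...)` playing the role of a channel `recv_timeout`. `slow_worker` is a hypothetical stand-in for the thread holding the socket.

```python
import queue
import threading
import time


def slow_worker(out, finished):
    time.sleep(0.2)   # stand-in for a slow blocking request
    finished.set()    # the work completes whether or not anyone is waiting
    out.put("late result")


results = queue.Queue()
finished = threading.Event()
worker = threading.Thread(target=slow_worker, args=(results, finished))
worker.start()

try:
    value = results.get(timeout=0.05)  # the caller gives up after 50 ms
except queue.Empty:
    value = None

still_running = worker.is_alive()  # the thread is still busy at this point
worker.join()                      # and it runs to completion regardless
```

The timeout only stops the waiting. The worker keeps its thread (and, in the real case, its connection) until it finishes on its own.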

Maybe I don't understand the complexity, but in good old Ruby I can easily stop a thread if I don't need the result anymore. No async needed and no yield points necessary. Doesn't it apply to Rust too?
I assume that Ruby does in fact have yield points in some form, such as a global lock. Killing a thread is only possible (for a pthread) via the `pthread_cancel` API. That API is very dangerous and is generally not something you'd ever want to use manually: the thread will not clean up any memory or other resources, and any shared memory is left in a tricky state.

To gracefully shut a thread down you need yielding of some kind.
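A minimal sketch of cooperative shutdown with plain threads, assuming the worker is willing to check a flag. The `stop` event and the `state` dict are illustrative names; the flag check is the "yield point".

```python
import threading
import time


def worker(stop, state):
    try:
        while True:
            state["iterations"] += 1  # one unit of pure compute work
            if stop.is_set():         # the yield point: a voluntary check
                break
    finally:
        state["cleaned_up"] = True    # resources get released on the way out


stop = threading.Event()
state = {"iterations": 0, "cleaned_up": False}
thread = threading.Thread(target=worker, args=(stop, state))
thread.start()

time.sleep(0.05)
stop.set()              # ask the thread to stop; nothing is killed
thread.join(timeout=5)
```

If the loop never checked `stop`, there would be nothing safe for the parent to do except wait.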

Most commercial code is running an almost entirely IO workload, acting as a gatekeeper to a database or processing user interactions - places where async shines.

Async isn't a lark, it's a workhorse. The goal is not to write sexy code, it's to achieve better utilization (which is to say, save money).

Depends on the nature of commercial code and if it has another level of parallelism (think of web servers and read my comment below). As for DB queries, here's the thing: most commercial code is using DB transactions and there is no way to run transaction across multiple connections, so you are either single-threaded and do things in sequence anyway (why use async then?), or you are multi-threaded and then forget about transactions. Besides that, even if you can get away with multiple transactions there are those pesky questions like "what to do with a partially failed state?". Not all transactions are idempotent, and not all are reversible, it's hard enough when you run them sequentially, and running them in parallel and dealing with a failure might be an absolute nightmare.
Most web applications (every one I've ever worked on) use connection pooling to run multiple transactions in parallel. I suppose you could think of that as a sort of network level parallelism, but it's not multithreading.

Connection pooling is of course not without its hazards. Scaling databases can be very difficult, and almost all of the production incidents I've dealt with involved a database running out of a resource (often connections). But for your garden-variety web app, it certainly isn't a dichotomy between serializing all concurrent updates and losing atomicity.
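A toy sketch of that arrangement on a single thread with `asyncio`. Everything here is hypothetical: `FakeConnection` only records statements and is not a real driver, and the pool is just an `asyncio.Queue`. The point is that each transaction holds one connection from `BEGIN` to `COMMIT`, while several transactions are in flight at once.

```python
import asyncio


class FakeConnection:
    """Hypothetical stand-in for a database connection; it only logs statements."""

    def __init__(self, name):
        self.name = name
        self.log = []

    async def execute(self, statement):
        await asyncio.sleep(0.01)  # pretend network round trip
        self.log.append(statement)


async def transaction(pool, order_id):
    conn = await pool.get()  # check out one connection for the whole transaction
    try:
        await conn.execute("BEGIN")
        await conn.execute(f"UPDATE orders SET paid = 1 WHERE id = {order_id}")
        await conn.execute("COMMIT")
        return conn.name
    finally:
        pool.put_nowait(conn)  # always hand the connection back


async def main():
    pool = asyncio.Queue()
    conns = [FakeConnection(f"conn-{i}") for i in range(2)]
    for conn in conns:
        pool.put_nowait(conn)
    # Four transactions share two connections, all on one thread.
    used = await asyncio.gather(*(transaction(pool, i) for i in range(4)))
    return conns, used


conns, used = asyncio.run(main())
```

Because a connection is never shared mid-transaction, each connection's log is a sequence of complete `BEGIN` / `UPDATE` / `COMMIT` triples, even though the transactions overlap in time.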

But async Python is single-threaded. I’d prefer async over multithreading in Python nowadays. Otherwise code can be slow as piss if it’s doing a lot of I/O. Then async is almost table stakes for any level of reasonable performance (GIL and all).
Not exactly sure how async in Python works, but if its runtime is non-preemptive and single-threaded (i.e. based on yield), then congratulations, you reinvented Windows 3.1! Those old enough to have been "lucky" enough to use it remember that the damn thing could hang the whole OS if your application was careless enough to block and not yield. Also, "slow" is relative: if you create a thread to do a DB query, thread creation is way faster than any DB request, so I'm not sure why it would be slow. Never had problems with Ruby threads even though Ruby doesn't have a mechanism to create a thread pool (didn't have? it's been some time since I worked with Ruby). Java & Scala, OTOH, use thread pools, even multiple variations of them, so the thread startup time doesn't matter. In any case you are talking about I/O, in which case neither thread startup nor context switching matters.
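The Windows 3.1 comparison can be reproduced in a few lines of stdlib `asyncio`: one task blocks without yielding, and nothing else on the event loop runs in the meantime. `heartbeat` is an illustrative name, and the blocking `time.sleep` stands in for any careless blocking call.

```python
import asyncio
import time


async def heartbeat(ticks):
    # A well-behaved task that yields to the event loop every 10 ms.
    while True:
        await asyncio.sleep(0.01)
        ticks.append(time.monotonic())


async def main():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))

    await asyncio.sleep(0.05)  # cooperative wait: the heartbeat keeps ticking
    before = len(ticks)

    time.sleep(0.2)            # blocking call with no yield: the loop is frozen
    during_block = len(ticks) - before

    await asyncio.sleep(0.05)  # yield again and the heartbeat resumes
    after = len(ticks)

    beat.cancel()
    return before, during_block, after


before, during_block, after = asyncio.run(main())
```

`during_block` comes out as zero: for those 200 ms the heartbeat task never got a turn, because nothing preempts a coroutine that does not `await`.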
Another reason is that it lets you handle bursty input with bursty CPU usage. Sounds great, right? Round peg, round hole.

But nobody will sell you just a CPU cycle. They come in bundles of varying size.

I recently heard a successful argument that we should take the pod that's 99% unutilized and double its CPU capacity so it can be 99.9% unutilized, that way we don't get paged when the data size spikes.

When I proposed we flatten those spikes, since they're only 100ms wide, it was shot down because "implementing a queueing architecture" wasn't worth the developer time.

I suppose you could call it a queueing architecture. I'd call it a for loop.
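One possible reading of the "for loop" remark, sketched with stdlib `asyncio`. The item count and the `handle` function are made up for illustration; the sketch only compares how many items are in flight at once when a burst is fanned out versus worked through one at a time.

```python
import asyncio


async def handle(item, stats):
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.001)  # stand-in for the per-item work
    stats["active"] -= 1


async def bursty(items):
    # Fan the whole burst out at once: load spikes with the input.
    stats = {"active": 0, "peak": 0}
    await asyncio.gather(*(handle(item, stats) for item in items))
    return stats["peak"]


async def flattened(items):
    # "A for loop": take the items one at a time, trading latency for a flat load.
    stats = {"active": 0, "peak": 0}
    for item in items:
        await handle(item, stats)
    return stats["peak"]


items = list(range(50))
bursty_peak = asyncio.run(bursty(items))
flattened_peak = asyncio.run(flattened(items))
```

The loop version never has more than one item in flight, at the cost of the burst taking longer to drain, which is the trade being argued for above.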

Your answer boils down to: "I know this technique, I don't want to learn blub technique. My job is to get stuff done, not learn new techniques." In which case, good for you; enjoy your sync code (seriously), and please stop telling the rest of us that have learnt the new blub technique that we shouldn't use it.
Nope, my answer boils down to: "I said I never had much use for one. Never said I didn't know how to use it." (c)
Honestly, your dismissal of its value sounds very much like you don't know how to use it. The whole argument can be turned around and the same said about threads, which are not "simple" as you suggest if you don't already know how to use them. You might as well say "simple async".
If you read my message carefully, I said there are cases where async is beneficial. Most of the time I don't think even threads are necessary. E.g. the most common application nowadays (arguably) is a web server. Of course those who write the web server itself may use whatever technology fits, but for us mortals who simply want to receive a request, query the DB and respond with data, even threads have very limited use. Why? Because web servers are already highly parallel: if you try to make your own request processing parallel, you starve another request (the DB is a limited resource, and most web apps don't require much computational power). So simple sync processing works just fine -- no headaches, no mental overhead, and you can focus on the business logic, which is what your employer values the most. The exception is when your company name is Twitter or X, whatever (which most web apps are not). Other cases? Depends, but the same approach applies: we usually have a bottleneck somewhere else, so when you try to be smart you just starve that. And by introducing a sophisticated approach where it's not necessary, you shift the focus away from the business logic (see above).
Also, parallel execution is never simple; there are multiple problems no matter what technology you use, be it async or threads. Meanwhile there are different kinds of threads too, you know: green, system, etc. There is Erlang, for example, which existed long before async was invented. Async is just the current hype, which always starts with "we solved this specific problem, let's do it everywhere!", then ... yeah, we did, but only for this special case, and it creates tons of problems elsewhere, but we are not going to look there, and if you are looking there we will declare you simply unable to learn our new shiny thing. Been there, seen that. Even had this mentality.