Hacker News
by oldgeezr 2928 days ago
Oh gee I guess I'm a wizard.

Lots of systems/embedded programmers roll their eyes at this kind of talk. Threads aren't really that hard.

Event queues do have benefits in certain situations. They pair nicely with state machines. You can easily end up in callback hell though, and it is often difficult to integrate some long-running, atomic tasks into your event loop. You end up doing things like having a thread pool, at which point you have to wonder why you stopped using threads in the first place. Oftentimes a threaded approach is a cleaner approach. Just get the locking granularity right - it's not that difficult.
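For what it's worth, here is a rough Python sketch of the "event loop plus thread pool" situation described above. The names `slow_checksum` and `handle_event` are made up for illustration; only `asyncio` and `concurrent.futures` are real APIs. It is a sketch of the pattern, not a recommendation either way.

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor


def slow_checksum(data: bytes) -> str:
    # Hypothetical long-running, blocking task; run inline, it would
    # stall every other event handler until it finished.
    digest = data
    for _ in range(10_000):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


async def handle_event(pool: ThreadPoolExecutor, payload: bytes) -> str:
    loop = asyncio.get_running_loop()
    # Hand the blocking work to a worker thread so the loop stays responsive.
    return await loop.run_in_executor(pool, slow_checksum, payload)


async def main() -> list:
    with ThreadPoolExecutor(max_workers=2) as pool:
        return await asyncio.gather(
            handle_event(pool, b"a"),
            handle_event(pool, b"b"),
        )


results = asyncio.run(main())
```

Note that the event loop did not remove the threads; it only moved them behind `run_in_executor`.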

7 comments

Systems/embedded programmers roll their eyes at this kind of talk because they usually control (or at least have visibility into) all of the code that goes into their stack. Threads aren't that hard under these conditions.

The main problem with threads is that they're non-composable: the set of locks that a thread holds is basically an implicit dynamically-scoped global variable that can affect the correctness of the program. If you call into an opaque third-party library, you have no idea what locks it may take. If it then invokes a callback into your own code, and you then call back into the library, there is a good chance that your callback will block on some lock that a framework thread holds, that framework thread will block on a lock you hold, and then the code that releases that lock will never execute. Deadlock.
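A minimal Python sketch of the simplest version of this, where a single thread re-enters a library from its own callback. `OpaqueLibrary` and its methods are hypothetical, and the `timeout` exists only so the demo reports the would-be deadlock instead of hanging; a real library would typically block forever.

```python
import threading


class OpaqueLibrary:
    """Hypothetical third-party library that takes a private lock."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant, invisible to callers

    def for_each(self, items, callback):
        with self._lock:
            # The callback runs while the library's lock is still held.
            return [callback(item) for item in items]

    def count(self, timeout=0.1):
        # Returns None if the lock could not be taken within the timeout,
        # which stands in for "blocked forever" in this demo.
        if not self._lock.acquire(timeout=timeout):
            return None
        try:
            return 42
        finally:
            self._lock.release()


lib = OpaqueLibrary()
# Calling back into the library from inside its own callback: stuck.
reentrant_results = lib.for_each([1, 2], lambda item: lib.count())
# The same call from outside any callback works fine.
plain_result = lib.count()
```

Nothing in the signature of `for_each` or `count` tells the caller that this combination is unsafe, which is the composability problem.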

If you control all of the code in your project, this does not affect you: define an order in which locks must be acquired and released and stick to it. If all of your dependencies have no shared data and never acquire locks themselves, this does not affect you (and indeed, this is recommended best practice for reusable libraries). If you never call back into third-party libraries from callbacks, this does not affect you, but it severely limits the set of programs you can write. If all of your dependencies thoroughly document the locks they take and in which order, this affects you but you can at least work around the problem areas and avoid surprise deadlocks.
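A sketch of the "define an order and stick to it" approach in Python, assuming you own every lock involved. `RankedLock` and `acquire_in_order` are illustrative names, not a standard API.

```python
import threading
from contextlib import ExitStack, contextmanager


class RankedLock:
    """A lock with a fixed rank; lower ranks are always taken first."""

    def __init__(self, rank):
        self.rank = rank
        self._lock = threading.Lock()

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


@contextmanager
def acquire_in_order(*locks):
    # Sort by rank so every thread takes the locks in the same global order;
    # ExitStack releases them in reverse order on exit.
    with ExitStack() as stack:
        for lock in sorted(locks, key=lambda lk: lk.rank):
            stack.enter_context(lock)
        yield


accounts_lock = RankedLock(rank=1)
audit_lock = RankedLock(rank=2)
balance = {"a": 100, "b": 0}


def transfer(first, second):
    # Callers may name the locks in any order; acquisition order is fixed.
    for _ in range(1000):
        with acquire_in_order(first, second):
            balance["a"] -= 1
            balance["b"] += 1
            balance["a"] += 1
            balance["b"] -= 1


t1 = threading.Thread(target=transfer, args=(accounts_lock, audit_lock))
t2 = threading.Thread(target=transfer, args=(audit_lock, accounts_lock))
t1.start()
t2.start()
t1.join()
t2.join()
```

Without the sort, the two threads would take the locks in opposite orders and could deadlock. This only works because both locks are visible to the code doing the ordering.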

Most application developers do not work under conditions where any of these are true, let alone all of them. Application development today largely consists of cobbling together third-party libraries and frameworks, many of which are undocumented, many of which are thread-unsafe, and many of which spawn their own threads and invoke callbacks on an arbitrary thread.

> the set of locks that a thread holds is basically an implicit dynamically-scoped global variable that can affect the correctness of the program

One technique to get a handle on this situation is making the mutexes actual explicit global variables.

"But global variables are bad" they will say. Yeah. And it reflects the reality.

"But I need a separate mutex for each object instance like they recommended in 1995 https://docs.oracle.com/javase/tutorial/essential/concurrenc... " they will say. Have fun with that.

Python and early Linux kernels use a single global mutex for access to all shared mutable state. In my experience, this is an entirely reasonable design decision for a huge majority of applications.
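Roughly what I mean, as a Python sketch. `BIG_LOCK`, `counters`, and `record_hit` are illustrative names; the point is only that there is exactly one lock and it is an explicit global.

```python
import threading

# One lock for all shared mutable state (the "big lock" approach).
BIG_LOCK = threading.Lock()

# Shared state, only ever touched while BIG_LOCK is held.
counters = {"hits": 0}
log = []


def record_hit(worker_id):
    for _ in range(1000):
        with BIG_LOCK:
            counters["hits"] += 1
            # Two pieces of state updated together under the same lock.
            if counters["hits"] % 1000 == 0:
                log.append(worker_id)


threads = [threading.Thread(target=record_hit, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a single lock there is no lock order to get wrong; the cost is that all access to shared state is serialized.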

Well, you don't let lock semantics leak outside a library's boundaries. That means you do two things: first, you organize the threads globally (no OOP-like patterns); second, you export threads to the outside world in a hierarchy that exactly reflects the code hierarchy (easiest if you export a single thread).

There are some patterns that are safe as long as you implement them correctly. The patterns that are good for IO are among the simplest, so that's where the GP was coming from. But it's not viable because he has full control of the code; it's viable because his problem domain has good options.

I agree with both of you. I don't think threads are THAT hard to work with. It definitely takes some experience to do it well and quite a bit more documentation to maintain the expected invariants. When libraries can get into a tangle, it's usually code that's in house and better ripped out. Easier said than done I know.

Open libraries tend to be either single-threaded, and should be used as such, or explicitly thread-safe.

Disclaimer: I've used threads in Java, not much in C. Love me some JSR-133 volatiles. Still confused by the Java 9 memory model updates.

Quite. I'm so fed up of the "threads are bad" argument (in my mind it's been commonplace since about 2008, so it's interesting to see this piece from 1995).

I've made use of threads at some point in almost every single job of any duration. They're one of many problem solving tools and if you understand them, which isn't particularly difficult, at some point you're bound to run into a problem that's a natural fit for a multi-threaded solution.

Nowadays, especially with no shared state, they're super-easy to use on many platforms. Take, for example, the parallel support in the .NET framework, along with functionality that supports debugging multi-threaded apps in Visual Studio like the ability to freeze threads.

If you do need to share state, which is when locking becomes essential, most languages and platforms have easy to use constructs to help you do this without much in the way of drama.

I'm not suggesting for a minute that there are no dangers, but there are plenty of dangers with other programming techniques, as well as lurking in any system of sufficient complexity, so I don't really understand why threads garner so much hate.

> which is when locking becomes essential, most languages and platforms have easy to use constructs to help you do this without much in the way of drama.

This is actually a problem. It is very easy to just slap locks around which, depending on your workload, can cause the threads to be blocked waiting for work.

I have seen many designs that used threads "for performance" but had so many locks in place that a single thread would actually perform similarly, with much less code complexity.

Once you get past a couple of locks in your code, it starts to smell.

Just because you can do Thread.New in your favorite language, doesn't mean you are using them correctly or efficiently.

It reminds me of a critique of threads in The Art of Unix Programming (available at http://www.catb.org/esr/writings/taoup/html/ch07s03.html#id2...). And now that I look it up, it actually cites the Ousterhout paper! This suspicion of threading was one of the few parts of that book I found unconvincing, personally, but it's another witness that they have worried some people for a long time.
A lot of work in programming languages over the past decade has been devoted to providing a safety net and guard rails for avoiding the pitfalls of thread-based concurrency. See in particular Rust and Go. It's still quite possible to corrupt data and get deadlocks, but our languages have come a long way toward making it harder.

But the point of this article is to say if we ditch the notion of threads entirely and go with this other thing, we won't need safety nets anymore because it will be impossible to deadlock and corrupt data (as opposed to less likely).

I love Go and goroutines, but besides the ability to select() over channels I wouldn't say Go has done much to help get concurrency _right_. Mostly just easier. Even Java has a few more tools for healthy concurrency.

I don't blame Go because I'm not convinced threads are all that bad, but having more concurrent data structures would be great.

Structured threads aren't that hard (e.g. task-based systems, thread pools).
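As a hedged illustration of what "structured" can look like, here is a small Python sketch using a thread pool. `word_count` and `documents` are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    # Pure function: no shared state, so no locks are needed.
    return len(text.split())


documents = ["threads are fine", "events are fine too", "it depends"]

# The with-block bounds the lifetime of the worker threads: every task has
# finished and the threads are joined by the time the block exits.
with ThreadPoolExecutor(max_workers=3) as pool:
    counts = list(pool.map(word_count, documents))
```

The threads never outlive the block that created them, and results come back in input order.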

Unmaintainable raw-pthread messes are a nightmare sequel from the director of Endless GOTOs.

Yes, small careful software teams can make threads work. However, if you start to work with physicists, mathematicians, electrical engineers, and so on who are incredibly smart in their own areas, but who don't have or even value a skill in software, you'll discover they make a real mess out of threaded programs in a way that doesn't happen with separate single-threaded processes.
If that's your audience, then you should give them a library/framework/language that hides all the complexity. I'm currently spending a lot of time working on Python Tornado stuff on an embedded device, and I can say that the lack of threads does not substantially reduce the number of ways you can screw things up.
They aren't my audience... they're my coworkers. Sometimes I get to pick how we do a project and sometimes I'm there to help them with their project. If they chose to use threads, I generally try to escape quickly.

No experience or comment on Python/Tornado. We don't really do a lot of web stuff.

But sure, people can screw things up in lots of ways. However, once a threaded program is screwed up, you really only fix it by starting a new version. It's near impossible to incrementally fix race conditions and deadlocks, because you can't reliably repeat the bug to debug it. Bugs in non-threaded code can at least be tracked down one by one.

There's even a course from GA Tech (Intro to Operating Systems, publicly available on Udacity) that covers how to use threads safely and sanely. I went in knowing nothing but terror from a failed experiment in naive multithreading and came out wanting to apply threads to everything. Maybe not quite the right approach, but I at least feel vastly more confident with keeping them manageable. Like you say, managing how and when to lock is the key.
> covers how to use threads safely and sanely.

It takes a lot less expertise to make events as fast as threads than it takes to make threads as safe as events. I don't know about the rest of you, but I personally do not have a brain that can become an expert on every topic.

If your state machine event handlers are non-blocking, then the thread pool is the same size as the number of available hyperthreads. That's not hard either. And, as observed elsewhere, it becomes impossible to screw up. That's a powerful property, and makes it possible for non-embedded 'normal' folks to write correct code in this space.
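One way this could look in Python, as a sketch under some assumptions: the transition table, `post`, and the routing-by-id scheme are all hypothetical, and each state machine is pinned to one worker so its state never needs a lock.

```python
import os
import queue
import threading

# Hypothetical transition table: (state, event) -> next state.
TRANSITIONS = {
    ("idle", "connect"): "open",
    ("open", "data"): "open",
    ("open", "close"): "idle",
}

NUM_WORKERS = os.cpu_count() or 1  # one worker per hardware thread
inboxes = [queue.Queue() for _ in range(NUM_WORKERS)]
shards = [{} for _ in range(NUM_WORKERS)]  # per-worker machine states


def worker(inbox, local_states):
    while True:
        item = inbox.get()
        if item is None:  # sentinel: shut down
            return
        machine_id, event = item
        current = local_states.get(machine_id, "idle")
        # Non-blocking handler: a table lookup; unknown events are ignored.
        local_states[machine_id] = TRANSITIONS.get((current, event), current)


def post(machine_id, event):
    # Route by id so all events for one machine are handled in order
    # by the same thread.
    inboxes[machine_id % NUM_WORKERS].put((machine_id, event))


threads = [
    threading.Thread(target=worker, args=(inbox, shard))
    for inbox, shard in zip(inboxes, shards)
]
for t in threads:
    t.start()
for machine_id in range(8):
    post(machine_id, "connect")
    post(machine_id, "data")
post(3, "close")
for inbox in inboxes:
    inbox.put(None)
for t in threads:
    t.join()

# Safe to merge once every worker has exited.
states = {mid: s for shard in shards for mid, s in shard.items()}
```

The handlers themselves never block and never share state; the only synchronization is inside the queues.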