by Traubenfuchs 2233 days ago
> Programming with concurrency primitives is a difficult task because of the challenges created by its shared memory model.

I never understood this often-repeated point. As a junior / mid-level developer I had the privilege of running self-written .jar files on government-scale systems with more than 50 cores. I used Java thread pools and concurrent data structures to do heavy cross-thread caching.

It was all pretty simple and concurrency & parallelism were never an issue but simply a necessity to make things run fast enough.

Am I a concurrent programming genius? Were the types of problems/challenges I was solving too simple? When is concurrency in Java ever hard+?

+ I know about Java masterpieces like the LMAX Disruptor that are mostly beyond my skill level, but those are low-level, write-once libraries you wouldn't write yourself.

9 comments

> When is concurrency in Java ever hard?

Potentially racy stuff:

* Synchronized primitives don't compose. You can safely `synchronized get(...)` and safely `synchronized put(...)`. But their composition put(get(...)+1) isn't synchronized. And it's hard to mentally revisit it at the end of the day: if you have a class with some methods marked synchronized, nothing will tell whether you've synchronized the right methods. You just have to think it through again and hope you reach the same conclusions as before.
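A minimal Java sketch of that composition problem. The class `SyncCounter` and its method names are hypothetical, made up for illustration. Both `get` and `put` are synchronized, yet `incrementRacy` can still lose updates because the lock is released between the two calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration only.
class SyncCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    // Individually thread-safe.
    synchronized int get(String key) {
        return counts.getOrDefault(key, 0);
    }

    // Individually thread-safe.
    synchronized void put(String key, int value) {
        counts.put(key, value);
    }

    // Racy: the monitor is released after get() returns and re-acquired
    // for put(), so two threads can read the same value and one
    // increment is lost.
    void incrementRacy(String key) {
        put(key, get(key) + 1);
    }

    // Safe: the whole read-modify-write runs under one lock acquisition.
    // Java monitors are reentrant, so the nested get()/put() calls are fine.
    synchronized void incrementAtomic(String key) {
        put(key, get(key) + 1);
    }

    // Runs `threads` threads, each incrementing `perThread` times,
    // and returns the final count.
    static int run(boolean atomic, int threads, int perThread) {
        SyncCounter counter = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    if (atomic) {
                        counter.incrementAtomic("k");
                    } else {
                        counter.incrementRacy("k");
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get("k");
    }
}
```

`SyncCounter.run(true, 4, 10000)` always returns 40000. `SyncCounter.run(false, 4, 10000)` can return less, and nothing in the compiler or the class declaration flags the difference.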

Other (non-racy) stuff:

* Threads are heavy, CompletableFutures are light. But CFs lack the functionality of Threads. A CF can't decide to sleep for a while, nor can it be cancelled. (As an aside, BEAM threads are super light).

Java has a large set of higher-level abstractions for concurrency. You don't have to use low-level locks, but you can. (And that's just Java; there's also Scala, Clojure ...)
I'm pretty firmly in the "shared memory parallelism is a Good Thing" camp, but the counter argument to your point is that having a larger set of concurrency abstractions is a Bad Thing in that any particular piece of code has to consider all of the different permutations. In a shared-nothing world, there's a lot less to worry about (except occasionally performance).
The world writeable cross thread also has implications on how your GC algorithm is designed.

Erlang for instance scopes gc pools per process so short lived processes just drop the pool. Also GC of one worker doesn't stop any others. Can't remember if it even needs to be generational because the heaps are already sliced by process. It's the closest thing to heap arenas I've seen in a VM based language.

Or take Lua which is single threaded and doesn't require VM safepoints since everything is done via cooperative coroutines.

Java needs to assume the worst case and as such has to be conservative in some of its approaches.

As written here: https://github.com/l3nz/SlicedBread - "the over 400 rich pages of "Java concurrency in practice" show how hard it is to write and debug a good-mannered multithreaded application in standard Java."
... what are they?
https://docs.oracle.com/javase/8/docs/api/java/util/concurre...

And given that you didn't know that, you really need to study them.

java.util.concurrent is one of the greatest gems of software ever written.

So I describe the shortcomings of CompletableFutures, and you point out that there's a java.util.concurrent package?

Is this method one of its gems? A cancel() which doesn't cancel? https://docs.oracle.com/javase/8/docs/api/java/util/concurre...

> Synchronized primitives don't compose. You can safely `synchronized get(...)` and safely `synchronized put(...)`. But their composition put(get(...)+1) isn't synchronized.

So is there anything in java.util.concurrent that does compose? put(get()) has exactly the same problem up here at the 'high level' (of CountDownLatches and Semaphores) as it does at the 'low level' (of synchronized methods).
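A partial answer, sketched in Java: for a single key, `ConcurrentHashMap` offers `merge` and `compute`, which are documented to perform the whole read-modify-write atomically. That covers the put(get()+1) case only. It does not make arbitrary sequences of calls compose. The class `MergeDemo` is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class for illustration only.
class MergeDemo {
    // Each thread increments the same key `perThread` times using merge(),
    // which performs the read-modify-write as one atomic operation.
    static int countWithMerge(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counts.merge("k", 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts.getOrDefault("k", 0);
    }
}
```

Writing `counts.put("k", counts.get("k") + 1)` against the same map would reintroduce the lost-update race, even though `get` and `put` are each thread-safe.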

Java does not allow one thread to kill another due to its shared memory concurrency model. It actually used to, but this feature was removed because it caused so many deadlocks. The reason is that killing threads won't always release monitors and locks. Lack of the feature is intentional.

You can get much better non-blocking support with third-party libraries. Like RxJS in JavaScript, RxJava is almost a requirement when writing non-blocking code.

You are right that true green threads would allow thread cancellation one day. Right now, Java can't because it relies on OS threads, which aren't safe to cancel. Userspace threads don't have that problem.

What do you mean the cancel method doesn't cancel? It does cancel, with a CancellationException stored in the future and an optional interrupt sent to the thread running the blocking operation.
Original Futures can't be cancelled. CompletableFuture can be, but only if the thread agrees to die.

You can achieve arbitrary non-blocking delays by using the crufty scheduled thread executor or doing it sanely with RxJava. Really, it's just dangerous to do non-blocking stuff in Java without a wrapper like RxJava. That's not a good thing; I look forward to the day there are real fibers.
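A rough sketch of the scheduled-executor approach in plain Java, without RxJava. `DelayDemo` and `delayed` are hypothetical names. A `ScheduledExecutorService` completes a `CompletableFuture` after the delay, so no thread sits in `Thread.sleep()` while waiting.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class for illustration only.
class DelayDemo {
    // Returns a future that is completed with `value` after `delayMillis`.
    // The scheduler thread is only occupied at the moment of completion.
    static <T> CompletableFuture<T> delayed(ScheduledExecutorService scheduler,
                                            T value, long delayMillis) {
        CompletableFuture<T> future = new CompletableFuture<>();
        scheduler.schedule(() -> { future.complete(value); },
                           delayMillis, TimeUnit.MILLISECONDS);
        return future;
    }

    // Chains a transformation onto the delayed value. The join() here
    // blocks the caller only so the demo can return a result.
    static String demo() {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        try {
            return delayed(scheduler, "hello", 50)
                    .thenApply(String::toUpperCase)
                    .join();
        } finally {
            scheduler.shutdown();
        }
    }
}
```

On Java 9 and later, `CompletableFuture.delayedExecutor(...)` covers a similar need without a hand-rolled helper.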

> > A CF can't decide to sleep for a while, nor can it be cancelled.

> How can't a CF be cancelled?

So glad you brought up the docs. CF implements cancel(boolean mayInterruptIfRunning)... which does nothing ;)

More precisely, it will not cancel a running CF. If the CF hasn't started yet, by all means cancel it. But if it's running, that cancel method does nothing.
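A small Java sketch of that behaviour, with the hypothetical class name `CancelDemo`. The future is marked cancelled, so dependents see a CancellationException, but the task body that is already running is neither interrupted nor stopped. The Javadoc for `CompletableFuture.cancel` states that `mayInterruptIfRunning` has no effect in this implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class for illustration only.
class CancelDemo {
    // Returns true if the future reports cancelled while the task body
    // still ran to completion without ever being interrupted.
    static boolean taskFinishedDespiteCancel() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
            started.countDown();
            try {
                Thread.sleep(200); // stand-in for real work
            } catch (InterruptedException e) {
                interrupted.set(true);
                return;
            }
            finished.countDown();
        });

        try {
            started.await();       // make sure the task is already running
            future.cancel(true);   // marks the future cancelled, nothing more
            boolean bodyCompleted = finished.await(5, TimeUnit.SECONDS);
            return future.isCancelled() && bodyCompleted && !interrupted.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Stopping the work itself needs a cooperative signal, for example a flag the task checks, or keeping the `Future` returned by `ExecutorService.submit`, whose `cancel(true)` does interrupt the worker thread.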

I started to write a response but remembered a Rich Hickey talk I went to where he lays out the problems with Java-style concurrency.

Clojure Concurrency - Rich Hickey https://www.youtube.com/watch?v=dGVqrGmwOAw

Even though the talk is called Clojure Concurrency, the first half is about the problems in traditional concurrency that Clojure is solving.

One of my favorite talks I ever went to.

This version has the slides and video side by side: https://www.youtube.com/watch?v=nDAfZK8m5_8 might be easier to follow.
What are his thoughts on Erlang? (I have not finished the talk video yet.)
That it contains many good ideas but the lack of shared memory and the poor sequential performance leave a lot to be desired.
>It was all pretty simple and concurrency & parallelism were never an issue.

A lot of developers are not aware of which thread is going to execute their code, or of what that implies (I think it takes practice, at least it did for me). In my experience it often leads to shared mutable state without proper guards, or deadlock hell from locks being created all over the place in the hope of making things safe, or other nightmares.

>I know about Java masterpieces like the LMAX Disruptor that are mostly beyond my skill level

Both the basic idea of the Disruptor, and its simplest implementation (mono publisher, mono subscriber), are pretty simple: just using minimal memory barriers to write and read data cycling on an array, and (busy-)wait whenever you bump into whoever is ahead (the publisher if you're the subscriber, or the subscriber if you're the publisher).
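To make that concrete, here is a rough single-publisher, single-subscriber ring buffer in Java. It is only a sketch of the idea described above, not the LMAX Disruptor API. `SpscRing` and its methods are hypothetical names, and it carries `long` values only to stay short.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative ring buffer for exactly one publisher thread and one
// subscriber thread. Not the LMAX Disruptor API.
class SpscRing {
    private final long[] slots;
    private final int mask;
    // Next sequence to write; only the publisher stores to it.
    private final AtomicLong published = new AtomicLong(0);
    // Next sequence to read; only the subscriber stores to it.
    private final AtomicLong consumed = new AtomicLong(0);

    SpscRing(int capacityPowerOfTwo) {
        if (Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    // Publisher side: busy-wait while the ring is full, then write the
    // slot and publish the new sequence with an ordered store.
    void publish(long value) {
        long seq = published.get();
        while (seq - consumed.get() == slots.length) {
            Thread.yield(); // bumped into the subscriber; wait
        }
        slots[(int) (seq & mask)] = value;
        published.lazySet(seq + 1); // slot write becomes visible first
    }

    // Subscriber side: busy-wait while the ring is empty, then read the
    // slot and release it back to the publisher.
    long take() {
        long seq = consumed.get();
        while (published.get() == seq) {
            Thread.yield(); // bumped into the publisher; wait
        }
        long value = slots[(int) (seq & mask)];
        consumed.lazySet(seq + 1);
        return value;
    }

    // Pushes 1..n through the ring from a second thread and sums them
    // on the calling thread.
    static long sumThrough(int capacity, int n) {
        SpscRing ring = new SpscRing(capacity);
        Thread publisher = new Thread(() -> {
            for (long i = 1; i <= n; i++) {
                ring.publish(i);
            }
        });
        publisher.start();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += ring.take();
        }
        try {
            publisher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

The sketch relies on there being exactly one thread per side. With several publishers or subscribers, the plain read-then-store on each sequence would itself become a race.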

Quoting one of its authors:

« Sometimes we have absolutely no choice and we need to go parallel and use a lot of concurrency. If you do, get people in who are good at it. And actually, I found most of the people who are really good at it, their instinct is they'll do it as an absolute last resort, because they know how complicated it actually gets. There is a Scottish comedian called Billy Connolly [who said]: "people who want to own a gun, or be a politician, should be automatically barred from either of them." And I think it's the same with concurrency: anybody who just wants to do it should not be allowed. » (https://www.infoq.com/presentations/top-10-performance-myths)

>When is concurrency in Java ever hard+?

Take a look at the dated but still relevant book by Brian Goetz, Java Concurrency in Practice. Many of the problems are illustrated with a code sample.

"Concurrency primitives" here is probably referring to Java's fundamental mutex system as used with `synchronized`, `wait()`, and `notify()`. Java's thread pools and concurrent data structures are built on top of these and as you noted are relatively straightforward to use correctly, as they take care of the actual coordination of threads for you.
It's fine if you know what you're doing and are the primary maintainer. As someone who's encountered code maintained over a long period of time with lots of people coming and going, I find concurrency using locks quite ugly, especially as people cargo-cult "better performing" solutions that aren't actually faster and just add complexity and race conditions. My experience is primarily C/C++, but this is all agnostic to the language.
> this is all agnostic to the language.

Yes and no. Yes, in the sense that pretty much equivalent things can be done in different languages. No, because I'm in the same group of "geniuses" as GP, and on our current huge C++ platform project I see highly technical people struggle with concurrency/multithreading and get it wrong in ways I don't remember seeing from even mildly technical people on various large Java projects.

> Were the types of problems/challenges I was solving too simple?

Without more information this is the likely scenario, going by my own experience.

BTW, if it turns out you are a concurrent programming genius please write about it, eh? (Like a blog or book or something.)

It's simply FUD and a strawman, designed to trump up support for the alternative. I mean, the actor model is fine in itself, but people like to set up a strawman problem to talk up its perceived benefits.
You use concurrent data structures but didn't write them yourself, correct? I would say that isn't programming using concurrency primitives.