kachapopopow 480 days ago

Yep, ran into this way too many times. Performing concurrent operations on non-thread-safe objects in Java, or generally in any language, produces the most interesting bugs in the world.

Espressosaurus 480 days ago

Which is why you manage atomic access to non-thread-safe objects yourself, or use a thread-safe version of them when using them across threads.

Multithreading errors are the worst to debug. In this case it's dead simple to identify at design time and warning flags should have gone up as soon as he started thinking about using any of the normal containers in a multithreaded environment.
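
A minimal Java sketch of the two options named above: guarding a plain `HashMap` with a lock yourself, or switching to a thread-safe map. The class and method names (`SharedCounterDemo`, `countWithLock`, `countWithConcurrentMap`, `runAll`) are made up for illustration and are not from the article under discussion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedCounterDemo {
    // Option 1: a plain HashMap where every access goes through one lock.
    static int countWithLock(int threads, int perThread) {
        Map<String, Integer> counts = new HashMap<>();
        Object lock = new Object();
        runAll(threads, () -> {
            for (int i = 0; i < perThread; i++) {
                synchronized (lock) {
                    counts.merge("hits", 1, Integer::sum);
                }
            }
        });
        synchronized (lock) {
            return counts.getOrDefault("hits", 0);
        }
    }

    // Option 2: a thread-safe map; ConcurrentHashMap.merge is atomic per key.
    static int countWithConcurrentMap(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        runAll(threads, () -> {
            for (int i = 0; i < perThread; i++) {
                counts.merge("hits", 1, Integer::sum);
            }
        });
        return counts.getOrDefault("hits", 0);
    }

    // Hypothetical helper: starts `threads` threads running `body` and waits for all.
    static void runAll(int threads, Runnable body) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(body);
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(8, 10_000));
        System.out.println(countWithConcurrentMap(8, 10_000));
    }
}
```

Dropping the `synchronized` block in option 1 is the kind of change that compiles fine and then loses updates or corrupts the map under load.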

BobaFloutist 480 days ago

Every time I think I'm sorta getting somewhere in my understanding of how to write code I see a comment like this that reminds me that the rabbithole is functionally infinite in both breadth and depth.

There's simply no straightforward default approach that won't have you running into and thinking through the most esoteric sounding problems. I guess that's half the fun!

mrkeen 480 days ago

It's not that bad. We just don't have the equivalent of GC for multi-threading yet, so the advice necessarily needs to be "just remember to take and release locks" (same as remembering to malloc and free).

Hopefully someone will invent something like STM [1] in the distant year of 2007 or so [2]. It has actual thread-safe data structures. Not just the current choice between wrong-answer-if-you-dont-lock and insane-crashing-if-you-dont-lock.

[1] https://www.adit.io/posts/2013-05-15-Locks,-Actors,-And-STM-...

[2] https://youtu.be/4caDLTfSa2Q?feature=shared

LegionMammal978 480 days ago

Rust takes pride in its 'fearless concurrency' (strict compile-time checks to ensure that locks or similar constructs are used for cross-thread data, alongside the usual channels and whatnot), while Go takes pride in its use of channels and goroutines for most tasks. Not everything is like the C/C++/C#/Java situation where synchronization constructs are divorced from the data they're responsible for.

neonsunset 480 days ago

Synchronization primitives in Go are just as divorced as elsewhere, sometimes even more so - it does have channels, but goroutines cannot yield a value, forcing you to employ a separate storage location together with WaitGroup/Mutex/RWMutex (which, unlike Rust's RwLock, is separate too, although C# lets you model it to an extent). This results in the community developing libraries like https://github.com/sourcegraph/conc which attempt to replicate Rust's Futures / C#'s Tasks.

jpc0 480 days ago

Writing to a channel of size 1 feels a lot like a yield to me, you can even do it in a loop.

A task is an abstraction over those primitives in any language. To my knowledge TBB's task graph abstracts over a thread pool using exactly that concept.

From what I've seen Swift is the only language that properly handles concurrency. I'm taking another crack at Rust, but the fact that everyone uses tokio for anything parallel makes me feel like the language doesn't have great support for concurrency, it just has decent typing, which isn't a surprise to anyone.

ratorx 480 days ago

For C++, Abseil’s thread annotations are quite nice for getting closer to the Rust style of locking. Of course, the Rust style is still much easier to understand and less manual.

gf000 480 days ago

None of them solve the problems associated with the general category of race conditions. You can trivially create livelocks and deadlocks with channels/message-passing, and Rust only prevents data races, though ownership is definitely a step in the right direction.

(Well, Go is not even memory safe under data races!)

Also, Java is one of the languages where you can just add `synchronized` as part of the method signature, and while this definitely doesn't solve the problem, I don't think "divorced from the data" is accurate.

mrkeen 478 days ago

Re: 'synchronized' and data. It is a good distinction to make because sync does indeed lock control, not data. With ACID transactions or STM, an atomic section will run as-if-sequentially, full stop, since the data is locked. With Java sync, you get 'no other thread is in these lines of code' and you have to hope that's enough for the system to run as-if-sequentially.
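
To make the control-versus-data distinction concrete, here is a rough Java sketch of a lock that owns its data, loosely in the spirit of Rust's `Mutex<T>`. `Guarded`, `withLock`, and `appendFromThreads` are hypothetical names, not a standard Java API. Unlike Rust, nothing here stops the callback from leaking the reference, so the guarantee is by convention only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

class GuardedDemo {
    // Hypothetical wrapper: the value is reachable only while the lock is held.
    static final class Guarded<T> {
        private final ReentrantLock lock = new ReentrantLock();
        private final T value;

        Guarded(T value) {
            this.value = value;
        }

        <R> R withLock(Function<? super T, ? extends R> action) {
            lock.lock();
            try {
                return action.apply(value);
            } finally {
                lock.unlock();
            }
        }
    }

    // Several threads append to one ArrayList; every access goes through withLock.
    static int appendFromThreads(int threads, int perThread) {
        Guarded<List<Integer>> items = new Guarded<>(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    int v = j;
                    items.withLock(list -> list.add(v));
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Integer size = items.withLock(List::size);
        return size;
    }

    public static void main(String[] args) {
        System.out.println(appendFromThreads(4, 1000));
    }
}
```

With a `synchronized` method, by contrast, the list would still be an ordinary field that any other unsynchronized method could touch.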

mrkeen 478 days ago

I'd love to get some examples of Rust's best-practice shared-mutable-state code. So far when I ask around here I get answers equivalent to "Rust guarantees that you aren't doing that."

DylanSp 480 days ago

It's not a perfect situation, but C# has some dedicated collection classes for concurrent use - https://learn.microsoft.com/en-us/dotnet/api/system.collecti.... There are still some footguns possible, but knowing "I should use these collections instead of the regular versions" is less error-prone than needing to take/release locks at every single use site.

sunshowers 480 days ago

Concurrent maps are generally worse in terms of being able to understand the system than either non-concurrent maps guarded by a lock, or a channel/actor model with single ownership. Data-parallel algorithms should also generally use map-reduce rather than writing into the same map concurrently.

I've written highly concurrent software with bog-standard hash maps plus channels. There are so many advantages to this style, such as events being linearized (and thus being easy to test against, log, etc).
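
The "bog-standard hash map plus channels" style might look roughly like this in Java, with a `BlockingQueue` standing in for a channel. This is a sketch under assumptions: `SingleOwnerDemo`, `countWords`, and the `STOP` sentinel are invented for illustration, and the commenter's actual code is not shown in the thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SingleOwnerDemo {
    // Hypothetical sentinel that tells the owner thread to stop.
    private static final String STOP = "\u0000stop";

    static Map<String, Integer> countWords(int producers, int perProducer) {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        // Plain HashMap: only the owner thread ever touches it while running.
        Map<String, Integer> counts = new HashMap<>();

        Thread owner = new Thread(() -> {
            try {
                while (true) {
                    String word = inbox.take(); // events arrive in one linear order
                    if (STOP.equals(word)) {
                        return;
                    }
                    counts.merge(word, 1, Integer::sum);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        owner.start();

        Thread[] senders = new Thread[producers];
        for (int i = 0; i < producers; i++) {
            String word = "w" + (i % 2);
            senders[i] = new Thread(() -> {
                for (int j = 0; j < perProducer; j++) {
                    inbox.add(word);
                }
            });
            senders[i].start();
        }
        try {
            for (Thread t : senders) {
                t.join();
            }
            inbox.add(STOP);
            owner.join(); // join makes the owner's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(4, 100));
    }
}
```

Because every mutation passes through the queue, logging or replaying the queue contents gives a complete, ordered record of what happened to the map.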

stouset 480 days ago

> "just remember to take and release locks"

If only it were so easy.

sunshowers 480 days ago

STM is not going to ever be a production thing outside of purely functional languages.

saagarjha 480 days ago

That’s what everyone thought about affine types, too.

sunshowers 480 days ago

True! I've been following STM and HTM research work for a while, and it all seems quite niche unless all side effects are captured (which is something purely functional languages can do). There isn't a real path to scalability I think, which there was with affine types.

Optimistic concurrency in general is a useful design pattern in many cases, though.
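
As one small instance of the optimistic pattern (not STM itself), here is a Java compare-and-set retry loop over an immutable snapshot. `OptimisticDemo`, `Balance`, `deposit`, and `depositFromThreads` are hypothetical names chosen for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

class OptimisticDemo {
    // Immutable snapshot; an update builds a new one instead of mutating.
    static final class Balance {
        final long amount;
        final long version;

        Balance(long amount, long version) {
            this.amount = amount;
            this.version = version;
        }
    }

    // Read, compute, then publish only if nobody else published in between.
    static Balance deposit(AtomicReference<Balance> ref, long delta) {
        while (true) {
            Balance seen = ref.get();
            Balance next = new Balance(seen.amount + delta, seen.version + 1);
            if (ref.compareAndSet(seen, next)) {
                return next;
            }
            // Another thread won the race; retry against the fresh snapshot.
        }
    }

    static Balance depositFromThreads(int threads, int perThread) {
        AtomicReference<Balance> ref = new AtomicReference<>(new Balance(0, 0));
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    deposit(ref, 1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ref.get();
    }

    public static void main(String[] args) {
        Balance b = depositFromThreads(4, 1000);
        System.out.println(b.amount + " after " + b.version + " updates");
    }
}
```

The retry is safe here only because computing `next` has no side effects, which is the same restriction the comment above points at for STM.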

sunshowers 480 days ago

The usual issue is code evolution over time, not the initial version which tends to be okay. You really want to have tooling strictly enforce invariants, and do so in a way that fails closed rather than open.

In other words, use Rust.

kachapopopow 480 days ago

Tell that to inexperienced developers, or to anyone retrofitting multi-threading onto a massive single-threaded project.

stuff4ben 480 days ago

I've been that developer making a single-threaded app multi-threaded. Best way to learn though!

baggy_trough 480 days ago

Multi-threading - ain't nobody got time for that.

mrkeen 480 days ago

Yeah, our software politely waits for one customer to finish up with their GETs and POSTs before moving onto the next customer.

We have almost one '9' of uptime!

baggy_trough 480 days ago

There are better ways than threading.

mrkeen 480 days ago

Yeah, like pretending you aren't

foobarian 480 days ago

I ran into my share of concurrency bugs, but one thing I could never intentionally trigger was any kind of inconsistency stemming from removing a "volatile" modifier from a mutable field in Java. Maybe the JVM I tried this with was just too awesome.

hashmash 480 days ago

Were you only testing on x86 or any other "total store order" architecture? If so, removing the volatile modifier has less of an impact.
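
For reference, the textbook shape of this experiment is a stop flag read in a spin loop. The sketch below keeps `volatile`, which guarantees the worker eventually sees the write. Without it, the Java Memory Model permits the JIT to hoist the read out of the loop, but whether that actually happens depends on the JVM, the JIT, and the hardware, so failing to reproduce it is not surprising. `VolatileFlagDemo` and `workerStopsWithin` are illustrative names.

```java
class VolatileFlagDemo {
    // volatile: a write by one thread is guaranteed to become visible to others.
    private static volatile boolean stop = false;

    // Returns true if the spinning worker observed the flag and exited in time.
    static boolean workerStopsWithin(long millis) {
        stop = false;
        Thread worker = new Thread(() -> {
            long spins = 0;
            while (!stop) {
                spins++; // busy work with no synchronization of its own
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if the worker hangs
        worker.start();
        try {
            Thread.sleep(50);   // let the loop get hot
            stop = true;
            worker.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(workerStopsWithin(2000));
    }
}
```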

bob1029 480 days ago

I've universally found that even when I am convinced that I am OK with the consequences of sharing something that isn't synchronized, the actual outcome is something I wasn't expecting.

loeg 480 days ago

The only things that should be shared without synchronization are read-only objects where the initialization is somehow externally serialized with accessors, and atomic scalars -- C++ std::atomic, Java has something similar, etc.

saagarjha 480 days ago

This is kind of a hot take but I actually prefer debugging races in C/C++ for this reason. Yes, the language prescribes insane semantics (basically none) when it happens, but in practice you’ll get memory corruption or other noisy issues pretty often, and the fact that races are mostly illegal means you can write something like thread sanitizer without needing source code changes to indicate semantics. Meanwhile in Java you’ll never have UB but often you’ll have two fields be subtly out of sync and it’s a lot harder to track this kind of thing down.

ivanjermakov 480 days ago

Some (maybe most?) operations on Java Collections perform integrity checks to warn about such issues, for example a map throwing ConcurrentModificationException.

kachapopopow 480 days ago

ConcurrentModificationException does not check threads; it triggers when it is already too late. It also triggers on a single thread if you remove an element from a collection while iterating over it.
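
The single-threaded case is easy to reproduce. A small Java sketch (class and method names are made up for the example) that triggers the exception with no second thread involved, next to the supported way of removing during iteration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class CmeDemo {
    // Removes through the list while a for-each iterator is active.
    static boolean removeInsideForEachThrows() {
        List<Integer> xs = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer x : xs) {
                if (x == 1) {
                    xs.remove(x); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // detected on the iterator's next() call
        }
    }

    // Removes through the iterator itself, which keeps its bookkeeping in sync.
    static List<Integer> removeOddsViaIterator() {
        List<Integer> xs = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = xs.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 1) {
                it.remove();
            }
        }
        return xs;
    }

    public static void main(String[] args) {
        System.out.println(removeInsideForEachThrows());
        System.out.println(removeOddsViaIterator());
    }
}
```

The check is a best-effort modification counter, so it says nothing about which thread made the change and is not a reliable detector of cross-thread misuse.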

smarks 480 days ago

ConcurrentModificationException is typically thrown from an iterator when it detects that it’s been invalidated by a modification to the underlying collection. It’s harder to check for the case described in this article, which is about multiple threads calling put() concurrently on a non-thread-safe object.
