by nextaccountic 741 days ago
> Not sure how that's the same as "doesn't see a reason to solve data races". I see lots of reasons. I just think it is possible to achieve the security goals without it.

If Carbon doesn't prevent data races, then how exactly will it achieve memory safety? Will it implement something like OCaml's "Bounding Data Races in Space and Time"? [0]

If we ignore compiler optimizations, the problem with data races is that they may let you observe tearing (incomplete writes), which makes it almost impossible to maintain safety invariants. The job of a safe low-level language is to give the programmer tools to guarantee the correctness of the unsafe parts, and in the presence of data races this is infeasible. So even if you find a way to ensure that data races aren't technically UB themselves, data races in a low-level language will surely lead to UB elsewhere.
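
To make the tearing argument concrete, here is a minimal sketch in Java (picked only because its data races have defined behavior). The names `TornSliceDemo`, `data`, `len`, and `snapshotIsConsistent` are hypothetical. It does not run a real race: it replays one possible interleaving on a single thread, so the outcome is deterministic.

```java
final class TornSliceDemo {
    // Hypothetical two-field value with the invariant len <= data.length.
    static int[] data = new int[8];
    static int len = 8;

    // Replays one interleaving of a racy writer and reader on a single thread.
    // Returns true if the reader's snapshot satisfies the invariant.
    static boolean snapshotIsConsistent() {
        data = new int[8];
        len = 8;

        data = new int[2];      // writer: first store of the update (8, 8) -> (2, 2)
        int[] seenData = data;  // reader: loads the new, shorter array ...
        int seenLen = len;      // ... together with the old length
        len = 2;                // writer: second store, too late for the reader

        return seenLen <= seenData.length;
    }

    public static void main(String[] args) {
        System.out.println("consistent snapshot: " + snapshotIsConsistent());
    }
}
```

The reader ends up holding a length of 8 for an array of 2 elements. In Java, using that pair would at worst throw an ArrayIndexOutOfBoundsException; in a language without bounds checks the same stale length becomes an out-of-bounds access.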

Ultimately this may surface as CVEs related to memory safety, so I don't think you can achieve your security goals without preventing data races.

[0] https://kcsrk.info/papers/pldi18-memory.pdf


It is possible to have a memory model that blocks word tearing without full logical data race prevention. Java does it, although it benefits from not having to deal with packed types etc.

I'm not sure, but I don't think this is the case. From https://openjdk.org/projects/valhalla/design-notes/state-of-...:

> Tearing

> For the primitive types longer than 32 bits (long and double), it is not guaranteed that reads and writes from different threads (without suitable coordination) are atomic with respect to each other. The result is that, if accessed under data race, a long or double field or array component can be seen to “tear”, where a read might see the low 32 bits of one write, and the high 32 bits of another. (Declaring the containing field volatile is sufficient to restore atomicity, as is properly coordinating with locks or other concurrency control.)

> This was a pragmatic tradeoff given the hardware of the time; the cost of atomicity on 1995 hardware would have been prohibitive, and problems only arise when the program already has data races — and most numeric code deals with thread-local data. Just like with the tradeoff of nulls vs. zeros, the design of primitives permits tearing as part of a tradeoff between performance and correctness, where primitives chose “as fast as possible” and objects chose more safety.

> Today’s JVMs give us atomic loads and stores of 64-bit primitives, because the hardware makes them cheap enough. But primitive classes bring us back to 1995; atomic loads and stores of larger-than-64-bit values are still expensive, leaving us with a choice of “make operations on primitives slower” or permitting tearing when accessed under race. For the new primitive types, we choose to mirror the behavior of the existing primitives.

> Just as with null vs. zero, this choice has to be made by the author of a class. For classes like Complex, all of whose bit patterns are valid, this is very much like the choice around long in 1995. For other classes that might have nontrivial representational invariants, the author may be better off declaring a value class, which offers tear-free access because loads and stores of references are atomic.

The key here is the last sentence: "For other classes that might have nontrivial representational invariants, the author may be better off declaring a value class, which offers tear-free access because loads and stores of references are atomic." This implies that to avoid tearing you would have to access the value through a reference, which adds a runtime cost to every access. That is unacceptable for a language aiming to replace C++.
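
The reference-based option from that passage can be sketched roughly as follows. This is an illustration under my own assumptions, not how Valhalla implements value classes: `TearFreeByReference`, `Pair`, and `countTornReads` are hypothetical names, and `AtomicReference` stands in for a reference field so that visibility between threads is explicit.

```java
import java.util.concurrent.atomic.AtomicReference;

final class TearFreeByReference {
    // Hypothetical immutable type with the invariant hi == -lo.
    static final class Pair {
        final long lo;
        final long hi;
        Pair(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    static final AtomicReference<Pair> shared = new AtomicReference<>(new Pair(0, 0));

    // Runs one writer and one reader concurrently and counts the snapshots
    // that violate the invariant.
    static long countTornReads(int iterations) {
        long[] torn = {0};
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= iterations; i++) {
                shared.set(new Pair(i, -i));   // allocates a fresh object per update
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                Pair p = shared.get();         // one atomic reference load, then an indirection
                if (p.hi != -p.lo) torn[0]++;
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return torn[0];
    }

    public static void main(String[] args) {
        System.out.println("torn reads: " + countTornReads(1_000_000));
    }
}
```

The reader never sees a half-updated `Pair`, because it only ever loads a reference to a fully constructed immutable object. The price is an allocation on every update and a pointer dereference on every read, which is the per-access cost in question.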

And you can assume that a low-level language like Carbon has a lot of types with nontrivial invariants. Just like in Java, data races WILL make one thread observe a value partially written by another thread.

In the presence of data races, you can only avoid tearing when writing to fields whose size is smaller than or equal to the word length (typically 64 bits). If all you have are small primitives or pointers, then it might work. But Carbon can't abide by this restriction either.
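
For the "fits in one word" case, a minimal sketch, again in Java and with hypothetical names (`PackedPair`, `set`, `snapshot`, `x`, `y`): two 32-bit fields are packed into a single 64-bit value so that each update is one atomic store. `AtomicLong` is used because the Java spec only guarantees atomic 64-bit access for volatile longs.

```java
import java.util.concurrent.atomic.AtomicLong;

final class PackedPair {
    // Both 32-bit fields live in one 64-bit word: x in the high half, y in the low half.
    private final AtomicLong bits = new AtomicLong();

    // Publishes both fields with a single atomic 64-bit store.
    void set(int x, int y) {
        bits.set(((long) x << 32) | (y & 0xFFFFFFFFL));
    }

    // Takes both fields with a single atomic 64-bit load.
    long snapshot() {
        return bits.get();
    }

    static int x(long snapshot) { return (int) (snapshot >> 32); }
    static int y(long snapshot) { return (int) snapshot; }

    public static void main(String[] args) {
        PackedPair p = new PackedPair();
        p.set(7, -3);
        long s = p.snapshot();
        System.out.println(x(s) + ", " + y(s));
    }
}
```

A reader that decodes one snapshot always gets an `x` and `y` from the same `set` call. The trick stops working as soon as the type no longer fits in a single word.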