by berkes 1641 days ago
When can data races across processes happen?

Are you talking about databases, services or IO and such?


I guess the simplest example is shared memory between processes.

Even Python has it: https://docs.python.org/3/library/multiprocessing.shared_mem...
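To make that concrete, here is a minimal sketch using `multiprocessing.shared_memory` (Python 3.8+). The helper name `demo_shared_block` is made up for illustration, and the second handle is attached from the same process purely to keep the example short; a separate process would attach by name in the same way.

```python
from multiprocessing import shared_memory


def demo_shared_block() -> int:
    """Write through one handle to a shared block and read through another."""
    # Create a small named block of shared memory.
    owner = shared_memory.SharedMemory(create=True, size=4)
    try:
        # Attach a second handle by name. A different process would do exactly
        # this; here it is the same process, as a stand-in.
        other = shared_memory.SharedMemory(name=owner.name)
        try:
            other.buf[0] = 42    # write through the second handle
            return owner.buf[0]  # read the same byte through the first handle
        finally:
            other.close()
    finally:
        owner.close()
        owner.unlink()  # release the block once nobody needs it


if __name__ == "__main__":
    print(demo_shared_block())
```

Nothing in this API synchronizes the two handles, so concurrent writers from different processes would have to coordinate on their own.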

Access to raw memory is locked behind the `unsafe` keyword though. Rust officially already does not guarantee any safety in that scenario, even within a single process.
There is always `unsafe` at some level in the standard library.

The point is that it doesn't protect the user of a crate that only exposes a fully safe API, unless they do digging to validate overall architecture safety.

>Even Python

It's not "even". Python specifically has it because it has no real threading.

Python does have real threading. The `threading` module provides OS-level threads and synchronization primitives. The only difference between this and multithreading in C or Java is that CPython's GIL prevents more than one thread from executing bytecode at a time. This prevents parallelism, but not concurrency.

Note this does not mean that Python code is thread-safe by default. At most, you can theoretically rely on individual bytecode operations being atomic, which means you'll still need to synchronize multi-threaded code with mutexes, semaphores and higher-level synchronization constructs.
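As a rough sketch of what that synchronization looks like, the snippet below guards a shared counter with `threading.Lock`. The function name `count_with_lock` and its parameters are invented for the example.

```python
import threading


def count_with_lock(n_threads: int = 4, increments: int = 10_000) -> int:
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            # `counter += 1` is a read-modify-write that spans several
            # bytecode operations, so it is guarded explicitly.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(count_with_lock())
```

With the lock in place the result is always `n_threads * increments`; without it, that is not something the language promises.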

Python has cooperative threading. It's the same threading model used in the Erlang VM, Julia and many other dynamically typed languages. But preemptive threading vs. cooperative threading is orthogonal to whether data races can happen. Java threads are preemptive but data races can still happen.
The Erlang VM does preemptive scheduling.
No it doesn't.
While this is technically true, it's quite misleading. The VM itself uses cooperative scheduling, but the Erlang compiler emits something akin to yields at appropriate points, such that the net effect is preemptive scheduling. You can break it by calling a NIF that doesn't yield appropriately, but that's not the norm.
SharedMemory is a new thing in Python. Not even supported by all 3.x versions.
But this is specifically about Rust.

What data races between processes, other than Disk/IO, databases, or external services, can a Rust program have?

I explicitly exclude the whole category of external services, since that is "by design" really, and the whole reason for ACID, global mutexes, transactions and CRDTs.

That, and any kind of data structure that can be shared via IPC mechanisms, some of which are even transparent to the processes.
Environment variables.

Locales.

Quite a few other POSIX bits, really.

It is not possible to have a data race with environment variables across multiple processes. Every process has its own copy of environment variables (in fact they have their own copy of the entire environment).
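A quick way to check this, sketched in Python: the child process overwrites a variable and the parent's value stays put. The variable name `HYPOTHETICAL_DEMO_VAR` and the helper `child_env_is_a_copy` are made up for the example.

```python
import os
import subprocess
import sys


def child_env_is_a_copy(name="HYPOTHETICAL_DEMO_VAR"):
    """Return (value seen by the child after it overwrites, value in the parent)."""
    os.environ[name] = "parent"
    # The child inherits a copy of the environment, overwrites the variable
    # in its own copy, and prints what it now sees.
    child_code = (
        "import os, sys\n"
        f"os.environ[{name!r}] = 'child'\n"
        f"sys.stdout.write(os.environ[{name!r}])\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,
    )
    # The parent's copy is untouched by the child's write.
    return result.stdout, os.environ[name]


if __name__ == "__main__":
    print(child_env_is_a_copy())
```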

I'm not sure what data race is possible across processes with locales; that claim is too vague to evaluate.

One kind of locale I know of is the LC_ env vars. So the "ENV is a copy" argument applies there too.

Another would be reading and writing locale files, such as JSON. But then the same applies as with any database or IO: this is inherently race-condition-prone, and that is by design.

Maybe grandparent is thinking about locales in many web frameworks, where the locale is some global var which should not be shared across users. So if you set `Locale.current = "EN_GB"`, that applies to any (email) notifications, errors, files, responses or such being sent out during that request/response, and during any jobs that request/response may spawn. In e.g. Rails this "somewhat global var" is a Frankenstein, but it works surprisingly stably, actually.