Hacker News new | ask | show | jobs
by tsukikage 948 days ago
If you are modifying TZ while another thread is relying on it to calculate time, those threads are racing, and hiding the crash won't solve the race: the reading thread will now randomly return values in the wrong timezone instead, subsequent code will use it in whatever operation it is it wanted the time for, the end result will be garbage, and this will be super hard to debug because there won't be a loud obvious crash pointing to the root cause and also depending on the winner of the race the symptoms will be random/intermittent.

Fix the high level race, and suddenly you no longer need the low level mutex.

2 comments

> the reading thread will now randomly return values in the wrong timezone instead, subsequent code will use it in whatever operation it is it wanted the time for, the end result will be garbage,

I really strongly disagree with how bad you seem to think this is. If you are designing your application to use the timezone and modify it at the same time, it is a totally natural consequence that you may see the previously set time zone in a timing dependent fashion. That's the nature of the beast. To "solve this" is seemingly to make that other thread capable of time travel or something. It read something before it was written, and acted on it. Reasonable!

The harmful data races are when you read intermediate results. If setting the timezone is a multi-step process, or involves manipulation on complex data structures with pointers that might be deallocated, then you are in grave danger. Seeing a previously valid result is ... I honestly don't know how you'd expect to solve it without threads being able to see the future, or some other unreasonable expectation.

> If you are modifying TZ while another thread is relying on it to calculate time

environ is a single contiguous null-terminated segment of null-terminated key-value pairs; any change of any environment variable might reallocate it, changing the address and invalidating the old address.

Also why it's a bad idea to store the pointer returned by getenv, it might be invalidated by any environment modification.

The strings in environ is only contiguous at program start. In every libc I'm aware of, both putenv and setenv replace only the specified key-value pair (and possibly environ itself, if it needs to be larger) and should not affect the address of any other environment variables. It is still thread-unsafe, but far more limited in its unsafety.
In current glibc master, it's unsafe for any putenv/setenv to race with any getenv, even if the variable names are different, for two reasons. (Note that multiple calls to putenv/setenv are serialized by a lock, but getenv does not take the lock.)

(1) setenv resizes environ using realloc, which frees the old buffer, so getenv can end up reading from a freed array.

(2) The code does not use atomics or memory barriers, so on weakly ordered architectures, getenv could observe another thread's write to one of the pointers in the environ array, or to the environ pointer itself, while observing stale values for the memory behind it.

In both cases, getenv could end up returning a bogus pointer or just crashing.

However, those issues can be fixed without changing the API, and at least Apple's libc seems to do the right thing here. On the other hand, other libcs such as musl, FreeBSD libc, and even OpenBSD libc (!) do worse than glibc and have no locking at all.

If someone could convince the maintainers of all those libcs to add a lock and make getenv/setenv 'thread safe as long as you're not racing on the same variable name', then that would be a good starting point. But in my opinion it would still be a half-measure. We need a fully thread-safe environment.

And honestly, it might be easier to convince the maintainers to add a full solution than a half-measure, even if it involved API changes. (But it may be hard either way. Rich Felker showed up in a Rust thread a while back and was highly negative on the idea of making any changes to musl.)

IMHO - I am sympathetic to the BSDs, Apple (presumably forked BSD), and musl approach.

In what sane world would someone reasonable treat (initial shell) Environment Variables as a proper ACID complaint database? About the 'best' solution I can see for preventing segmentation faults related to resizing the env array during runtime is to defer reclaiming freed memory chunks until after all in-process threads have been given another uninterrupted timeslice to process. Even that wouldn't be 100% but probably would cover any not pathological case.

atomic ordering is very easy if you don't care about performance. So on the other hand we could ask why get/put/setenv have such a terrible need for performance that we can't afford to put a simple lock around them.