Hacker News new | ask | show | jobs
by jart 622 days ago
nsync has wrapper macros for all the various atomics libraries that prevented it from using two things.

1. Weak CAS. nsync always uses strong CAS upstream to make the portability abstraction simpler. Being able to use weak CAS when appropriate helps avoid code being generated for an additional loop.

2. Updating the &expected parameter. nsync upstream always manually does another relaxed load when a CAS fails. This isn't necessary with the C11 atomics API, because it gives you a relaxed load of the expected value for free when it fails.

Being able to exploit those two features resulted in a considerable improvement in nsync's mu_test.c benchmark for the contended mutex case, which I measured on RPI5.