by ajross 3362 days ago
> Imagine two cores both trying to get a lock. The lock is held by a cache line in one of the cores. If both cores try to get the lock very often, the core further away may very well never get the lock.

If that were true, then your problem is L2-cache-bound and trying to fit your solution into SMP is the Wrong Thing. In fact, the single-threaded behavior you end up with is going to be faster (by definition!) than the "fair" architecture you seem to want.

No one serious thinks this behavior of the default locking primitive is a good thing. Maybe (maybe) it's a good fit for some particular problem somewhere, but I'd want to see a benchmark. It's definitely not a consensus opinion among people who do serious thought about synchronization.


> No one serious thinks this behavior of the default locking primitive is a good thing.

Clearly some people do or OS X wouldn't do it.

Would it be the first badly implemented thing in OS X?

Mutexes are heavily used and get a lot of scrutiny, and work was done to speed them up when 10.11 (or was it 10.10? Whichever one added QoS) came out. I'm positive the people who maintain it are aware of the tradeoffs between fair and unfair locks and made the default fair intentionally.

Or more generally, just because Linux and Windows both behave in a certain manner doesn't make that de facto correct. It's all tradeoffs, and different people value different things. Correct-by-default (i.e. fair locks) is valuable, the only question is whether it's worth the performance hit (although you can always opt in to unfair locks to get better performance).

> I'm positive the people who maintain it are aware of the tradeoffs between fair and unfair locks and made the default fair intentionally.

I think you're neglecting a couple of less-technical factors here. Yes, fairness was certainly made the default for reasons at some point (but still, even then, one might argue that an opt-in solution might have been better). On the other hand, there's the likely possibility that those reasons don't really hold true anymore. Think of Scene Graphs vs. Entity Component Systems in high-performance videogame design - in this example the rise of caching made a whole architecture outdated.

Then again, like removing the GIL in Python, such decisions are not to be taken lightly because of the things you will break. It's very likely that there are applications that would still have problems with starving threads, and just switching from opt-out to opt-in will make them break for no apparent reason in the strangest of circumstances. I know, Apple likes to break things more often than MS, but I'd guess that's not a risk they're willing to take. Imagine you're updating your OS and a dozen apps that worked for a decade and don't get updates anymore start behaving strangely.

So, it's not unreasonable to settle for a less-than-optimal solution that still keeps things working and only makes things slower in the worst-case scenario. That doesn't mean it's not open to criticism, though.

> Yes, fairness was certainly made the default for reasons at some point (but still, even then, one might argue that an opt-in solution might have been better).

I think an unfair or opt-in-fair solution was deemed worse than fair-by-default for the simple reason that the most straightforward way to implement mutexes is

  enter kernel mode
  maybe grab a spinlock if on SMP
  add yourself to the waiting list
  sleep until woken up

  do critical section

  enter kernel mode
  spinlock
  wake the first waiting thread

That's functionality every mutex must have, and it happens to give fair semantics for free. Making the semantics weaker for the purpose of optimization (and not just for the sake of making application developers' lives harder, or for future flexibility which may never be needed) actually takes additional work on top of that.
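
To make the "fair for free" point concrete, here is a minimal user-space sketch of the same idea in C++: waiters take a ticket (the waiting list) and are served strictly in arrival order. `FairMutex` and `run_fair_counter` are hypothetical names made up for illustration, and the condition variable stands in for the kernel sleep/wake; this is not how any particular OS implements its mutexes.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical FIFO ("fair") mutex: waiters are served in arrival order.
class FairMutex {
public:
    void lock() {
        std::unique_lock<std::mutex> guard(state_);
        // "add yourself to the waiting list": take the next ticket.
        const std::uint64_t my_ticket = next_ticket_++;
        // "sleep until woken up": block until this ticket is being served.
        turn_changed_.wait(guard, [&] { return now_serving_ == my_ticket; });
    }

    void unlock() {
        {
            std::lock_guard<std::mutex> guard(state_);
            assert(now_serving_ < next_ticket_);  // unlock without a matching lock
            ++now_serving_;                       // hand over to the oldest waiter
        }
        // Wake everyone; only the thread holding the matching ticket proceeds.
        turn_changed_.notify_all();
    }

private:
    std::mutex state_;                      // protects the two counters below
    std::condition_variable turn_changed_;
    std::uint64_t next_ticket_ = 0;
    std::uint64_t now_serving_ = 0;
};

// Hypothetical helper: several threads increment a counter under the fair lock.
long run_fair_counter(int threads, int iterations) {
    FairMutex mutex;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                mutex.lock();
                ++counter;
                mutex.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling `run_fair_counter(4, 1000)` returns 4000. Note that the releasing thread cannot immediately re-acquire ahead of a waiter, because its next `lock()` gets a later ticket; that is exactly the behavior an unfair lock gives up in exchange for throughput.
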

Since we are talking about a uniprocessor desktop OS developed in the nineties, it's plausible they didn't care about mutex performance as much as today and giving this extra guarantee afforded by their simple implementation seemed reasonable at the time.

Note that every half-decent mutex implementation will not enter the kernel (in either lock or unlock) unless the mutex is contended.
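
A rough sketch of that fast-path/slow-path split, in C++: the uncontended case is a single atomic compare-and-swap, and only a failed attempt falls through to the blocking path. `FastPathMutex`, its three-state encoding, and the slow-path counter are assumptions for illustration; a `std::condition_variable` stands in for whatever kernel wait primitive (a futex, say) a real implementation would use. This variant is deliberately unfair: a newly arriving thread can take the lock ahead of sleeping waiters.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical mutex: 0 = free, 1 = held, 2 = held and someone may be sleeping.
class FastPathMutex {
public:
    void lock() {
        int expected = 0;
        // Fast path: one CAS, no blocking call, when nobody holds the lock.
        if (state_.compare_exchange_strong(expected, 1)) return;

        // Slow path: mark the lock contended and sleep until it is released.
        slow_path_entries_.fetch_add(1);
        std::unique_lock<std::mutex> guard(park_mutex_);
        while (state_.exchange(2) != 0) park_cv_.wait(guard);
    }

    void unlock() {
        const int previous = state_.exchange(0);
        assert(previous != 0);  // unlock of a mutex that was not held
        // Only pay for a wakeup if a waiter may be sleeping.
        if (previous == 2) {
            std::lock_guard<std::mutex> guard(park_mutex_);
            park_cv_.notify_one();
        }
    }

    long slow_path_entries() const { return slow_path_entries_.load(); }

private:
    std::atomic<int> state_{0};
    std::atomic<long> slow_path_entries_{0};
    std::mutex park_mutex_;             // only used on the slow path
    std::condition_variable park_cv_;
};

// Hypothetical helper: lock/unlock from one thread, report slow-path entries.
long uncontended_slow_path_entries(int iterations) {
    FastPathMutex mutex;
    for (int i = 0; i < iterations; ++i) {
        mutex.lock();
        mutex.unlock();
    }
    return mutex.slow_path_entries();
}

// Hypothetical helper: several threads increment a counter under the lock.
long contended_counter(int threads, int iterations) {
    FastPathMutex mutex;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                mutex.lock();
                ++counter;
                mutex.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

`uncontended_slow_path_entries(1000)` returns 0: a single thread never leaves the fast path. Under contention the slow path is taken some unspecified number of times, but `contended_counter(4, 10000)` still returns 40000.
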

Apple can and does update APIs in ways that preserve old behavior for apps linked against older SDKs, specifically so old apps continue to work. They could do the same here with versioned symbols, if they wanted to switch mutexes to unfair-by-default. I believe they intentionally go with fair-by-default because it's the safe choice, the choice that guarantees program correctness, since most people don't think about this sort of thing and probably won't be able to figure out themselves when their locks should be fair or unfair. It's the same reason C11 and C++11 atomics default to sequential consistency when not otherwise specified; that's the slowest memory ordering, but it's the one that's guaranteed to be correct for all cases. If you know what you're doing you can override the default and pick something else.

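
For anyone who hasn't used the atomics API, the "safe default, opt in to weaker" pattern looks roughly like this in C++. The two function names are made up for illustration; the point is only that `store()` and `load()` with no argument mean `std::memory_order_seq_cst`, and the explicit release/acquire pair is the opt-in that is still sufficient for this particular handoff.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Default ordering: no argument means std::memory_order_seq_cst.
int handoff_with_defaults() {
    int data = 0;                      // plain, non-atomic payload
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        data = 42;
        ready.store(true);             // seq_cst by default
    });
    std::thread consumer([&] {
        while (!ready.load()) std::this_thread::yield();  // seq_cst by default
        seen = data;
    });
    producer.join();
    consumer.join();
    assert(ready.load());              // both threads are done, flag must be set
    return seen;
}

// Explicit opt-in: release on the store, acquire on the load. Weaker than
// seq_cst, but still enough to make the write to `data` visible to the reader.
int handoff_with_release_acquire() {
    int data = 0;
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        data = 42;
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) std::this_thread::yield();
        seen = data;
    });
    producer.join();
    consumer.join();
    assert(ready.load(std::memory_order_relaxed));
    return seen;
}
```

Both functions return 42. Swapping in `std::memory_order_relaxed` for the flag would make the read of `data` a data race, which is the kind of mistake the conservative default protects against.
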
> Apple can and does update APIs in ways that preserve old behavior for apps linked against older SDKs, specifically so old apps continue to work.

Fair point, although that still won't rule out applications that get updates (and thus still would have to be completely re-evaluated on the basis of such a case).

Also, wouldn't e.g. Homebrew also get those problems if you compile against the new SDK? (Non-Mac user here, so maybe I've got the wrong impressions on that...)

> I believe they intentionally go with fair-by-default because it's the safe choice [...]

Safety might be a big concern, but on the other hand, pthreads is a standard - if the article's right, and POSIX doesn't mandate fairness, you might still argue that this "addon" was better put into a custom solution than the other way around.

Then again, I've always suspected that strange implementations (or a total lack thereof) of POSIX must be one of the main reasons why Boost exists...