by pjdesno 80 days ago
I don't actually get your point.

You dismissed the standard lock-guarded data structure as a "bogus comparison", despite it being the way every programmer is taught to write multi-threaded code.

Now the more you write, the more you seem to make the case that (a) normal programmers shouldn't be writing code like this, and (b) there are significant speedups possible if someone who knows what they're doing *does* write a highly tuned lock-free library.

1 comment

The easy speedup is to use two mutexes: one that protects head and tail_cached, and another that protects tail and head_cached, each aligned to its own cache line so they don't interfere. In other words, take the RingBufferV5 from the article and define the class members like this:

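  std::array<T, N> buffer_;

  // hmu_ and the two fields it guards start on their own 64-byte boundary.
  alignas(64) absl::Mutex hmu_;
  std::size_t head_{0};         // guarded by hmu_
  std::size_t tail_cached_{0};  // guarded by hmu_

  // tmu_ and its fields start on a separate 64-byte boundary.
  alignas(64) absl::Mutex tmu_;
  std::size_t tail_{0};         // guarded by tmu_
  std::size_t head_cached_{0};  // guarded by tmu_

Then drop the atomics from the code and rely on the locks alone. On my system this is more than ten times faster than the baseline naïve thread-safe RingBufferV2. That's what I mean about using a bogus baseline.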