by madanMus 4084 days ago
[I am one of the authors of the paper.]

The paper reports overhead numbers from existing research. For instance, see Figure 18 in http://arxiv.org/abs/1312.1411, which shows the cost of SC for memcached - 1% on x86 and 3% on ARM.

This overhead is primarily due to the cost of fences on existing hardware. What we (not so cheekily) say is this is likely to get better as hardware platforms provide better implementations for fences.
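To make the fence cost concrete, here is a minimal sketch of the classic store-buffering litmus test, written with C++11 atomics rather than the compiler-inserted fences the paper discusses. The names `Outcome`, `run_store_buffering`, and `sc_forbids_both_zero` are made up for illustration. Under SC at least one thread must see the other's store, so the result (0, 0) is forbidden; x86 store buffers would allow it for plain accesses, which is why compilers typically implement the `seq_cst` store with an `xchg` or a `mov` followed by `mfence`.

```cpp
#include <atomic>
#include <thread>

// Result of one run: what each thread read from the other's flag.
struct Outcome {
    int r0;
    int r1;
};

// Store-buffering litmus test: each thread stores to its own flag,
// then loads the other thread's flag. All accesses are seq_cst.
Outcome run_store_buffering() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r0 = -1;
    int r1 = -1;

    std::thread t0([&] {
        x.store(1, std::memory_order_seq_cst);  // this store is where the fence cost is paid
        r0 = y.load(std::memory_order_seq_cst);
    });
    std::thread t1([&] {
        y.store(1, std::memory_order_seq_cst);
        r1 = x.load(std::memory_order_seq_cst);
    });
    t0.join();  // join() makes r0 and r1 safely visible to the caller
    t1.join();
    return {r0, r1};
}

// Repeats the test and reports whether the SC-forbidden outcome (0, 0)
// was ever observed. With seq_cst accesses it never should be.
bool sc_forbids_both_zero(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        Outcome o = run_store_buffering();
        if (o.r0 == 0 && o.r1 == 0) {
            return false;
        }
    }
    return true;
}
```

Replacing both orderings with `std::memory_order_relaxed` removes the fence and, on real hardware, may let (0, 0) show up.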


Hi, thanks for replying.

> The paper reports overhead numbers from existing research. For instance, see Figure 18 in http://arxiv.org/abs/1312.1411, which shows the cost of SC for memcached - 1% on x86 and 3% on ARM.

But that's the bit I don't find nearly convincing enough. You say (p.5) that you're going to "rebut these arguments" that "SC is too expensive", but the main figure of 1%/3% comes from a non-standard, research-level static analysis tool that, if I read that paper correctly, works on code bases of up to 10k LOC and takes a few minutes to run. Can that really be generalised? I'm not quite sure. The other tools in the comparison did much worse, which I think may be closer to what one would get in practice. So I think that's not a good rebuttal: if you consider the tools actually available, SC may well be too expensive.

I'm not saying you're wrong, just that I don't think you've proven your case very clearly. I was kind of expecting a much clearer rebuttal than I found. Sorry about the snark.

Yes, you did read correctly. The 1-3% figure does assume a non-standard whole-program analysis. Something more practical on existing hardware will look more like the E numbers (for escape analysis): 17.5% on x86 and 6.8% on ARM. An even dumber analysis (H) adds up to 37.5% overhead on x86.

It is important to realize that the overhead numbers are not so huge, like 5x or 20x, that SC can simply be written off.

As we say in the paper, these overheads, however small they might be, will be unacceptable for low-level libraries written in C/C++. The main argument of the paper was that these overheads are acceptable for "safe" languages like Java/C#, which anyway add other (useful) overheads such as garbage collection, bounds checking, etc. to help out the programmer.

Even for C/C++, it will still make sense to have SC by default, with the programmer responsible for explicitly turning off SC semantics in the parts of the code where the inserted fences hurt performance. This is a much better tradeoff: safe by default and performance by choice.
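C++11 atomics already follow this "safe by default, performance by choice" shape, though only for `std::atomic` objects and not for all memory accesses as the paper proposes, so take the following as an analogy and not as the paper's scheme. In this minimal sketch, `HitCounter` and its member names are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// A shared counter with two ways to increment it.
class HitCounter {
public:
    // Default path: no ordering argument, so the operation is
    // sequentially consistent (std::memory_order_seq_cst).
    void record() { hits_.fetch_add(1); }

    // Explicit opt-out for a hot path where only the final count
    // matters and no other data is published through this counter.
    void record_relaxed() { hits_.fetch_add(1, std::memory_order_relaxed); }

    // Sequentially consistent read of the current count.
    std::uint64_t total() const { return hits_.load(); }

private:
    std::atomic<std::uint64_t> hits_{0};
};
```

The opt-out is visible at the call site, so a reviewer can see exactly where the weaker guarantee was chosen.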