by trungaczne 3477 days ago
I don't get the point you're trying to make. Aren't std::memory_order's guarantees supposed to work on all supported archs?

The code in the OP does use memory fences. Are you implying that their implementation is incorrect?
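
For reference, this is the kind of fence-based pattern I mean. It's a minimal sketch, not the OP's code: `payload`, `ready`, `producer`, `consumer` and `run_once` are names I made up for illustration. As far as I understand the standard, the release fence before the relaxed store and the acquire fence after the relaxed load pair up, so the consumer must see the payload on any conforming implementation, regardless of the CPU underneath.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration; not taken from the OP's code.
int payload = 0;                    // plain, non-atomic data
std::atomic<bool> ready{false};     // flag published after the payload

void producer() {
    payload = 42;                                         // write the data first
    std::atomic_thread_fence(std::memory_order_release);  // earlier writes cannot move below this fence
    ready.store(true, std::memory_order_relaxed);         // then publish the flag
}

int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {      // spin until the flag is visible
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // later reads cannot move above this fence
    return payload;                                       // sees the value written before the release fence
}

// Runs one producer/consumer pair and returns what the consumer observed.
int run_once() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

Calling `run_once()` should return 42 every time. Dropping either fence makes the read of `payload` a data race, which is exactly the kind of mistake that tends to go unnoticed on x86.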

1 comments

I'm not implying their implementation is incorrect, just that these things are very easy to get wrong, and when you do it's usually the type of bug that takes months to track down, after you've eliminated every other subsystem involved.

Generally if there's not a huge organization putting their reputation (and $$$) on the line there are going to be bugs.

Most of the time, if you're going lock-free for performance reasons, there are usually much larger gains to be found in your cache usage or overall architecture.
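
One example of what I mean by cache usage is false sharing between per-thread counters. The sketch below is only an illustration under assumptions: the 64-byte line size is a guess that fits common x86-64 parts (C++17 offers `std::hardware_destructive_interference_size` as a portable hint), and `PaddedCounter`, `kThreads` and `count_in_parallel` are made-up names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines. Common on x86-64, not universal.
constexpr std::size_t kCacheLine = 64;
constexpr int kThreads = 4;  // hypothetical worker count

// Each counter sits on its own cache line, so one thread's writes
// do not keep invalidating the line another thread is writing to.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

// Every worker increments only its own counter; the totals are summed after join.
long count_in_parallel(long per_thread) {
    PaddedCounter counters[kThreads];
    std::vector<std::thread> workers;
    for (int i = 0; i < kThreads; ++i) {
        workers.emplace_back([&counters, i, per_thread] {
            for (long n = 0; n < per_thread; ++n) {
                counters[i].value.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    long total = 0;
    for (auto& c : counters) {
        total += c.value.load(std::memory_order_relaxed);
    }
    return total;
}
```

The result is the same with or without the `alignas`; only the amount of cache-line traffic changes. How much that matters depends on the hardware and the workload, so measure before and after.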

Generally if there's not a huge organization putting their reputation (and $$$) on the line there are going to be bugs.

This argument applies to any hard problem, so it doesn't seem valid. Whether there's an important bug in a project depends on someone's skill and on how much time they've dedicated to it, and it's hard to know how skilled or dedicated someone is.

When the prior against any particular implementation being correct is so high, I think it's correct to not trust any new implementation without strong evidence that it is correct, even if one is not aware of any specific issues. Personally I wouldn't adopt a new lock-free structure implementation without at least one of established backing or a formal proof of correctness.
Yeah, the more I think about it the parallels with hand-rolled crypto are pretty strong, although if you get crypto wrong you don't just crash.
You hit it, but it's not a side point.

The entire problem of rolling your own security code is that you will never know if you messed it up.

For functional code this is only an issue with silently corrupted data.

This is less about "skill" than about awareness of how the different CPUs are implemented and where the algorithm does not behave correctly in conjunction with the CPU spec.

In addition, the error class is a mean one: it happens rarely and is difficult to reproduce, and as such can be very expensive to track down.
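
A classic illustration is the store-buffering litmus test, sketched below with made-up names (`x`, `y`, `both_saw_zero`, `count_violations`). With `memory_order_seq_cst` on every access, the C++ memory model forbids both threads reading 0. Weaken the orderings to release/acquire or relaxed and that outcome becomes legal, and it can show up even on x86 because of store buffers, but only once in a while.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical shared variables for the litmus test.
std::atomic<int> x{0};
std::atomic<int> y{0};

// Each thread stores to one variable, then loads the other.
// Returns true if both loads saw 0, which seq_cst rules out.
bool both_saw_zero() {
    x.store(0, std::memory_order_seq_cst);
    y.store(0, std::memory_order_seq_cst);
    int r1 = -1;
    int r2 = -1;
    std::thread a([&r1] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&r2] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome appeared.
int count_violations(int iterations) {
    int violations = 0;
    for (int i = 0; i < iterations; ++i) {
        if (both_saw_zero()) {
            ++violations;
        }
    }
    return violations;
}
```

With seq_cst as written, `count_violations` should always return 0. The catch is the other direction: a weakened version can also return 0 for thousands of runs on your machine and still be wrong, so a passing stress test says little about correctness.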

The specs are quite clear about memory fences. Just because something has a failure mode that's hard to detect doesn't mean that luck has anything to do with implementing it correctly. And if luck isn't a factor, then that leaves skill and dedication.
Specs/hw can have bugs too and he never said anything about luck.

I have no issue with hard problems but the accountability for concurrency issues is gnarly. I've had driver issues look like concurrency bugs and concurrency bugs look like driver issues. If you feel the need to take on concurrency you better have the schedule budget for it or be willing to throw it away.