| HN Mirror

	by trungaczne 3525 days ago
	I don't get the point you're trying to make. Aren't std::memory_order's guarantees supposed to work on all supported archs? The code in the OP does use memory fences. Are you implying that their implementation is incorrect?
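
For readers who have not met the guarantee in question, a minimal sketch of a release/acquire handoff in standard C++ follows. It is not the OP's code; the names `payload`, `ready`, `producer`, `consumer` and `run_handoff` are made up for illustration. The point is that the standard, not the CPU, defines what the consumer is allowed to see, so the same source is meant to be correct on any conforming implementation.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names, for illustration only (not taken from the OP's code).
int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag used to publish the payload

void producer() {
    payload = 42;                                  // write the data first
    ready.store(true, std::memory_order_release);  // then publish it
}

int consumer() {
    // The acquire load pairs with the release store above: once it reads
    // true, the earlier write to payload is guaranteed to be visible.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

int run_handoff() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = 0;
    std::thread writer(producer);
    std::thread reader([&seen] { seen = consumer(); });
    writer.join();
    reader.join();  // join makes the write to seen visible here
    return seen;
}
```

Weakening either side to `std::memory_order_relaxed` would turn the read of `payload` into a data race, and such a bug may never show up on a strongly ordered CPU like x86 while still failing on a weaker one.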

vvanders 3525 days ago

I'm not implying their implementation is incorrect. Just that these types of things are very easy to get wrong, and when you do, it's usually the type of bug that takes months to track down after you've eliminated every other subsystem involved.

Generally if there's not a huge organization putting their reputation (and $$$) on the line there are going to be bugs.

Most of the time if you're going lock-free for performance reasons there are usually much larger gains to be found in your cache usage or overall architecture.
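
One hedged illustration of a cache-usage fix of this kind is padding per-thread counters so that they do not share a cache line (false sharing). This is a sketch under stated assumptions, not something from the thread: the 64-byte line size is an assumption that holds on many but not all CPUs, and `PaddedCounter`, `kThreads` and `count_in_parallel` are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Assumed cache-line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t kCacheLine = 64;
constexpr int kThreads = 4;  // hypothetical fixed thread count

// alignas pads each counter out to its own cache line, so one thread's
// increments do not invalidate the line holding another thread's counter.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

long count_in_parallel(long per_thread) {
    PaddedCounter counters[kThreads];  // stack array, alignment is honored
    std::thread pool[kThreads];
    for (int i = 0; i < kThreads; ++i) {
        pool[i] = std::thread([&counters, i, per_thread] {
            for (long n = 0; n < per_thread; ++n) {
                // Relaxed is enough: each thread touches only its own slot.
                counters[i].value.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    long total = 0;
    for (int i = 0; i < kThreads; ++i) {
        pool[i].join();  // join orders the thread's writes before this read
        total += counters[i].value.load(std::memory_order_relaxed);
    }
    return total;
}
```

Whether the padding helps depends on the workload, so it is worth measuring with and without `alignas` before keeping it.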

sillysaurus3 3525 days ago

> Generally if there's not a huge organization putting their reputation (and $$$) on the line there are going to be bugs.

This argument applies to any hard problem, so it doesn't seem valid. Whether there's an important bug in a project depends on someone's skill and on how much time they've dedicated to it, and it's hard to know how skilled or dedicated someone is.

lmm 3524 days ago

When the prior against any particular implementation being correct is so high, I think it's right not to trust any new implementation without strong evidence that it is correct, even if one is not aware of any specific issues. Personally I wouldn't adopt a new lock-free structure implementation without either established backing or a formal proof of correctness.

vvanders 3524 days ago

Yeah, the more I think about it the parallels with hand-rolled crypto are pretty strong, although if you get crypto wrong you don't just crash.

marcosdumay 3524 days ago

You hit it, but it's not a side point.

The entire problem of rolling your own security code is that you will never know if you messed it up.

For functional code this is only an issue with silently corrupted data.

je42 3525 days ago

This is less about "skill" and more about awareness of how the different CPUs are implemented and of where the algorithm does not behave correctly in conjunction with the CPU spec.

In addition, the error class is a mean one: it happens rarely, is difficult to reproduce, and as such can be very expensive to track down.

sillysaurus3 3525 days ago

The specs are quite clear about memory fences. Just because something has a failure mode that's hard to detect doesn't mean that luck has anything to do with implementing it correctly. And if luck isn't a factor, then that leaves skill and dedication.

vvanders 3524 days ago

Specs/hw can have bugs too, and he never said anything about luck.

I have no issue with hard problems, but the accountability for concurrency issues is gnarly. I've had driver issues look like concurrency bugs and concurrency bugs look like driver issues. If you feel the need to take on concurrency you'd better have the schedule budget for it or be willing to throw it away.
