by cpgxiii 969 days ago
At the end of the day, no matter what kind of code you are writing, you either have tools and processes in place that reduce the risk/mitigate the impacts of bugs, or there is always the risk of serious problems being introduced. An unknowing change that breaks parallelism in another component could just as well be an unknowing change that breaks authentication or defeats a security boundary.

Parallelism introduces an additional class of bugs, but they are fundamentally addressed the same way as any other class of bugs, e.g. through testing, tools, and code review. If someone can unknowingly break a system, that means the tools and processes weren't good enough.

2 comments

One difference from most other classes of bugs is that threading issues can be quite nondeterministic, which makes it harder to automatically disambiguate between flaky tests and real bugs being caught.

Also, the code introducing a race condition may get lucky when your CI system runs the tests and still make it into your main branch.

I agree that tooling (like static analysis, Rust's borrow checker, etc.) can play a big role here, though.
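
To make the "gets lucky in CI" point concrete, here is a minimal sketch of a toy interleaving model. Everything in it (`Worker`, `run_schedule`, `all_schedules`, `lost_update_count`) is a hypothetical name made up for illustration, and it simulates two logical workers step by step rather than using real threads or a real scheduler. Each worker does a non-atomic increment split into a load and a store.

```rust
// Toy model (an assumption for illustration, not a real scheduler): two logical
// workers each perform a non-atomic increment split into a load and a store.
#[derive(Clone, Copy)]
struct Worker {
    pc: u8,     // 0 = about to load, 1 = about to store, 2 = done
    local: u64, // value observed by the load step
}

/// Runs one interleaving. `schedule[i]` names the worker (0 or 1) that takes step i.
fn run_schedule(schedule: &[usize]) -> u64 {
    let mut shared = 0u64;
    let mut workers = [Worker { pc: 0, local: 0 }; 2];
    for &w in schedule {
        let worker = &mut workers[w];
        match worker.pc {
            0 => worker.local = shared,     // load the shared value
            1 => shared = worker.local + 1, // store the (possibly stale) value + 1
            _ => continue,                  // this worker has already finished
        }
        worker.pc += 1;
    }
    shared
}

/// Every ordering of four steps in which each worker takes exactly two.
fn all_schedules() -> Vec<Vec<usize>> {
    (0u32..16)
        .filter(|mask| mask.count_ones() == 2)
        .map(|mask| {
            (0..4u32)
                .map(|i| ((mask >> i) & 1) as usize)
                .collect::<Vec<usize>>()
        })
        .collect()
}

/// Number of interleavings in which one of the two increments is lost.
fn lost_update_count() -> usize {
    all_schedules()
        .into_iter()
        .filter(|schedule| run_schedule(schedule) != 2)
        .count()
}
```

In this model `run_schedule(&[0, 0, 1, 1])` returns 2 and `run_schedule(&[0, 1, 0, 1])` returns 1, so the bug only shows up when the steps overlap. A real scheduler does not pick interleavings uniformly, and the serialized orderings tend to be far more likely for short critical sections, which is how a racy change can pass its tests many times in a row.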

That is the issue. It is very hard to write tests that verify parallel code is correct, because buggy code can easily work 99.9% of the time. This is not the case with typical functional requirements.
It is much the same case with security requirements, though. You can have all the tests of intended behavior, but they won't necessarily tell you anything about unintended behavior. You need better tooling and specifically focused tests to have confidence the code is correct and safe.
Please elucidate. Concretely, what tools and testing methods are you referring to?
For parallel code, the obvious answers are static and dynamic analyzers. E.g. for C and C++ you'd use TSan (ThreadSanitizer) to catch data races and MSan (MemorySanitizer) to catch reads of uninitialized memory. The Rust borrow checker is essentially a memory/thread safety static analyzer baked into the compiler.

Particularly for dynamic analysis, you need to have test cases that usefully cover the design behavior. E.g. if you design a component to be safely shared, you need tests that exercise that sharing where the static/dynamic analyzer(s) will identify unsafe sharing. Likewise, if you know something is unsafe, you should probably have tests that demonstrate that the static/dynamic analyzer(s) do detect the unsafe usage.
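
As a rough sketch of what "a test that exercises the sharing" might look like, here is a small Rust example. `HitCounter` and `hammer` are hypothetical names invented for this illustration, not part of any real library. The idea is that the test drives the component from several threads at once, so the compiler's `Send`/`Sync` checks (and a dynamic analyzer, if you run one) actually see the concurrent access pattern the component was designed for.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical component that is designed to be shared across threads.
#[derive(Default)]
pub struct HitCounter {
    hits: AtomicU64,
}

impl HitCounter {
    pub fn record(&self) {
        // fetch_add is a single atomic read-modify-write, so no update is lost.
        self.hits.fetch_add(1, Ordering::Relaxed);
    }

    pub fn total(&self) -> u64 {
        self.hits.load(Ordering::Relaxed)
    }
}

/// Exercises the intended sharing: `threads` workers each call `record`
/// `per_thread` times on the same counter, then the final total is returned.
pub fn hammer(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(HitCounter::default());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.record();
                }
            })
        })
        .collect();
    // Joining every worker establishes that all increments happened before the read.
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counter.total()
}
```

A test would then assert something like `hammer(8, 10_000) == 80_000`. If `hits` were changed to a `std::cell::Cell<u64>`, `HitCounter` would no longer be `Sync` and the `thread::spawn` call should fail to compile, which is the static-analysis half of the argument. The same shape of test is what you would run under TSan for a C or C++ component.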