|
|
|
|
|
by inaseer
2053 days ago
|
|
Concurrency is hard - this was a great investigation into the bug using TLA+ by the author which suggested simplifications to the code in addition to the bug fix. This reminded me of another reader/writer concurrency bug posted by Tom Cargill in the c2 wiki caused by surprising behavior of Java monitors, which Hillel Wayne showed how to discover and then fix using TLA+ at https://www.hillelwayne.com/post/augmenting-agile/ We were able to reproduce and fix the same bug using Coyote as well, and documented our experience at https://cloudblogs.microsoft.com/opensource/2020/07/14/extre... I was pleasantly surprised at the finding and was definitely heartened by our experiment where we were able to reproduce the bug in actual working C# code without the need of an additional modeling step. While TLA+ is undoubtedly very powerful when it comes to modeling and catching concurrency bugs, I do hope tools like Coyote get wider adoption and are implemented for more languages so bugs in critical infrastructure code can be caught in a scalable and repeatable way while staying within just one language for implementation _and_ testing. |
|