|
|
|
|
|
by _benedict
485 days ago
|
|
We have something very similar[1] we use in the Apache Cassandra project to test complex cluster behaviours. We appear to use exactly the same basic technique, using byte weaving to intercept concurrency primitives such as synchronized, LockSupport etc to pause the system thread and run them on some schedule. We only currently run (deterministic) probabilistic traces though, we can’t search the interleaving space. But the traces for a whole cluster are extremely complex and probably unsearchable. I have been meaning to publish it for broader consumption for years now, but there’s always something more important to do. It’s great to see some dedicated efforts in this space. [1] https://github.com/apache/cassandra/tree/trunk/test/simulato... |
|
It seems that all controlled threads are wrapped with `InterceptibleThread` in the Cassandra simulator. Does this work for ThreadPools (e.g., ForkJoinPool) as well? We had a hard time intercepting thread objects because they are used by the language runtime (e.g., GC threads) as well and we don’t want to interfere with them. Additionally, modifying application code just track thread creation isn’t ideal. To work around this, we came up with this combination of JVMTi and Java Agent solution and we use JVMTi to monitor thread creation and termination.
As for searching schedules, yes, it is hard to search all possible schedules. However, it turns out many searching algorithms such as probabilistic concurrency testing[1] or partial order sampling[2] are still better than random walk. So it is worth to give them a try.
[1] https://www.microsoft.com/en-us/research/wp-content/uploads/... [2] https://www.cs.columbia.edu/~junfeng/papers/pos-cav18.pdf