by antirez
4265 days ago
Yep, Jepsen is more suitable for checking systems that claim either linearizability, or at least write safety, during partitions. I guess a modified version of Jepsen could be used to validate the known failure modes, or to discover unexpected ones that, on human inspection, look easy to reproduce in actual production environments. Also, I don't know if Jepsen is good at this, but in theory it could be instrumented to check how good the implementation is: that is, even if the system is not designed for write safety during partitions, how well are the countermeasures working?
|
On the other hand, it sounds like Redis Cluster offers few hard guarantees; instead, it promises that failures should be rare 'in practice'. Which is a fine thing for a tool to do, of course, but it makes things less amenable to the kind of stress-testing Jepsen does -- since running inside Jepsen's little universe is about as far from normal operation as you can get. If you already know that a system can fail in a certain way, getting Jepsen to reproduce that failure tells you very little.
If you'd like to make this kind of testing possible, it would be useful to state as many 'positive' rules as possible, which Redis Cluster should always respect -- things like "if a majority of nodes are fully connected, they should always accept writes" and "an unpartitioned cluster should always agree on the same value" -- alongside the documentation on ways it might fail. This way, clients can be assured of the 'bare minimum' that the system supports, and tools like Jepsen can give you more useful information.
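To make that concrete, here is a rough sketch of how the two example rules above could be written as checks over a snapshot of a test run. Everything in it is hypothetical: `Observation`, `link`, and the two checker functions are illustrative names, not part of Jepsen or Redis Cluster, and "majority" is assumed to mean more than half of all nodes. The quorum search is brute force, so it only makes sense for small test clusters.

```python
from dataclasses import dataclass
from itertools import combinations


def link(a, b):
    """An undirected network link between two nodes (hypothetical helper)."""
    return frozenset((a, b))


@dataclass(frozen=True)
class Observation:
    """Hypothetical snapshot of the cluster at one point in a test run."""
    nodes: frozenset        # every node id in the cluster
    links: frozenset        # link(a, b) for each pair that can communicate
    accepted_write: dict    # node id -> True if it accepted a test write
    values: dict            # node id -> value it returns for the test key


def fully_connected(group, links):
    """True if every pair of nodes in the group can communicate."""
    return all(link(a, b) in links for a, b in combinations(sorted(group), 2))


def majority_accepts_writes(obs):
    """Rule: if a majority of nodes are fully connected, they accept writes."""
    quorum = len(obs.nodes) // 2 + 1
    # Any larger connected group contains a connected group of quorum size,
    # so checking groups of exactly quorum size covers every case.
    for group in combinations(sorted(obs.nodes), quorum):
        if fully_connected(group, obs.links):
            if not all(obs.accepted_write[n] for n in group):
                return False
    return True


def unpartitioned_agrees(obs):
    """Rule: an unpartitioned cluster agrees on the same value."""
    if not fully_connected(obs.nodes, obs.links):
        return True  # the rule says nothing about a partitioned cluster
    return len({obs.values[n] for n in obs.nodes}) <= 1


# Three nodes, with "c" cut off from the other two.
obs = Observation(
    nodes=frozenset("abc"),
    links=frozenset({link("a", "b")}),
    accepted_write={"a": True, "b": True, "c": False},
    values={"a": 1, "b": 1, "c": 0},
)
print(majority_accepts_writes(obs), unpartitioned_agrees(obs))
```

A rule stated this way either holds or is violated for a given run, which is the kind of yes/no answer a stress-testing tool can report.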