by wmsiler
2668 days ago
From the post, FaunaDB initially had several issues, which they've generally resolved. Jepsen is open source, so I'm curious why a database company wouldn't run Jepsen internally, work out as many problems as they can, and then engage aphyr in order to get the official thumbs up. Given how important data integrity is, I would assume that any database company would be running Jepsen (or something equivalent) regularly in-house. If they are doing that, then how is it that aphyr finds so many previously unknown issues? And if they aren't running Jepsen in-house, why not?
However, correctness testing is fundamentally adversarial, like security penetration testing. Building a database is not easy, and testing a database is not easy either. It is a separate skill set, as anomalies that lingered for decades in other databases reveal. The engagement with the Jepsen team is explicitly designed to explore the entire product surface area for faults, not to apply Jepsen as it currently stands. Thus, a lot of custom work ensued on both sides to make sure that the database was both properly testable, and properly tested. The result of that work is what you see in the report.
The typical Jepsen report implicates not just implementation bugs but the architecture of the system itself: Jepsen usually identifies anomalies that could not be prevented even with a perfect implementation. That did not happen here.
Some vendors restrict their engagement with the Jepsen team to only what they have tested themselves already, although those tests are not always valid. This was not our mindset—we wanted to improve our database by taking advantage of Kyle’s expertise, not present a superficially perfect report that failed to actually exercise the potential faults of the system.