|
His initials were LT, if that helps (if not, I can clarify his whole name over email or however people privately communicate on HN?). I may well be exaggerating his 'major'ness - I'm not that familiar with the Cassandra contributorship, but that was my understanding from my team! As for Cassandra itself, we were certainly quite sophisticated users. There's a decent chance you'd know the company in question.
It's a fair point about the language: I'm not a fan of Java and it probably colours my opinion a little. I was speaking more from a philosophical standpoint about reducing the theoretical complexity of software to make it more deterministic & understandable (out of the tar pit and all that) than about any specific deficiencies in Java that caused actual issues for us; I can't recall any direct examples of those. (Pathological GC did cause some occasional degradation, I suppose, though at best that's semi-specific to Java.)
I think most of the actual issues, from my on-call years, were the result of stuff like: (as I mentioned) anticompaction and suchlike causing pathological performance; our own misconfiguration of things like asynchronous replication / bootstrapping (which once caused a very severe incident, as in 'endemic data corruption and loss' severity); and application-layer issues from product engineers misconfiguring consistency, choosing poor keys for partitioning, constructing poor data models that require table scans, all the ordinary stuff for which Cassandra is at most very obliquely to blame.
Also, I agree it was stupid of us to use Cassandra in (what was) a very serious environment, in probably one of the most safety-critical sectors outside of medicine and rockets. We knew that. We did the same with several technologies. Literally, we had a diktat saying no engineers could mention it in blog posts. On reflection it's quite unfair to blame Cassandra for our decision - to a large extent, yeah, we were holding it very wrong.
I would not have made that choice myself, at all. |
Cassandra was super easy to shoot yourself in the foot with, and it remains quite easy. You mention a few footguns that have gotten better, a few that remain but will get better, and a few that are sort of inherent to distributed databases.
Anti-compaction, for instance, should no longer be a huge issue if you're running regular incremental repairs, and I hope even the few remaining caveats will be alleviated soon. Bootstrap, which you also mention, is going to get much easier for users this coming year, so that unsophisticated users can manage their cluster membership safely.
Application-misconfigured consistency levels are a really obvious one: not strictly Cassandra's fault, but something for which the user could be given much better help. I expect some major improvements here in the next year or two, so that users can configure tables with consistency properties that the database guarantees (to some extent the user will always be able to screw it up by providing the wrong consistency identifier, but at least the scope is reduced to accidental misconfiguration rather than misunderstanding). This is something we're considering as part of the introduction of general purpose transactions later this year.
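To make the misconfiguration concrete, here is a rough sketch of the usual replica-overlap rule: a read is only guaranteed to see the latest acknowledged write when the replicas contacted for the read and for the write must overlap, i.e. R + W > RF. The function names are made up for illustration, and it is deliberately simplified to a single datacenter (no LOCAL_* or EACH_QUORUM levels).

```python
# Sketch only: replica counts per consistency level in one datacenter.
# `replicas_required` and `read_sees_latest_write` are hypothetical helpers,
# not part of any Cassandra driver.

def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must respond for the given consistency level."""
    table = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": rf // 2 + 1,  # strict majority of the replicas
        "ALL": rf,
    }
    return table[level]


def read_sees_latest_write(write_level: str, read_level: str, rf: int) -> bool:
    """True when the read and write replica sets are forced to overlap (R + W > RF)."""
    return replicas_required(write_level, rf) + replicas_required(read_level, rf) > rf


if __name__ == "__main__":
    rf = 3
    for w, r in [("QUORUM", "QUORUM"), ("ONE", "ONE"), ("ONE", "QUORUM"), ("ONE", "ALL")]:
        print(f"write={w:<6} read={r:<6} overlap guaranteed: {read_sees_latest_write(w, r, rf)}")
```

The typical accident is the ONE/ONE or ONE/QUORUM pairing with RF=3: every individual request succeeds, but nothing forces the read to touch a replica that has the write.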
Poor data models and partition keys are things the database can offer less help with, though I anticipate much better support for ordered partitioning in future, which would help with poorly selected partition keys, as clustering keys can be used for partitioning there too.
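For anyone wondering why a poorly chosen partition key hurts, a toy illustration: rows are placed by hashing the partition key, so a key with only a handful of distinct values can only ever land on a handful of nodes. This is a sketch under loud assumptions. It uses MD5 and a plain modulo instead of Cassandra's actual partitioner and token ring, and the key values and node count are invented.

```python
import hashlib
from collections import Counter

# Sketch only: hash-modulo placement standing in for a real token ring.


def node_for(partition_key: str, node_count: int) -> int:
    """Map a partition key to a node index by hashing it."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def rows_per_node(partition_keys, node_count: int) -> Counter:
    """Count how many rows each node ends up owning."""
    return Counter(node_for(key, node_count) for key in partition_keys)


NODES = 6
ROWS = 9000

# Low-cardinality key: three distinct values, so at most three nodes hold data.
by_country = rows_per_node((["GB", "US", "DE"][i % 3] for i in range(ROWS)), NODES)

# High-cardinality key: one value per row, so rows spread across every node.
by_user = rows_per_node((f"user-{i}" for i in range(ROWS)), NODES)

if __name__ == "__main__":
    print("partitioned by country:", dict(by_country))
    print("partitioned by user id:", dict(by_user))
```

The same number of rows is stored either way; the difference is that the low-cardinality key concentrates them into a few huge partitions on a few nodes.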
Regarding the choice of language, Java has upsides and downsides. GC spirals are something we have control over at the end of the day, and something we continue to get better at handling (as does the JVM), but guaranteeing no segfaults (and not worrying about the ABA problem) is a big benefit we get in return. I wish we had more control over things like memory placement and execution, and these things may be coming to the JVM to some extent (I expect Loom to benefit Cassandra hugely, and value types later); equally, distributed systems problems often give you enough things to worry about.
The visibility you have into a Java process is fantastic, however, and we are starting to make use of the ease with which you can modify the code Java runs for system validation, using bytecode weaving to permit us to simulate clusters as they are run, with adversarial event orderings, to ensure those notoriously hard distributed systems problems are correctly solved.
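As a very small illustration of what "adversarial event orderings" buys you, the sketch below enumerates every interleaving of two clients doing a non-atomic read-then-write increment on a shared register and reports the schedules that lose an update. It is not how the Cassandra simulator works (there is no bytecode weaving here, and all the names are invented); it only shows the idea of checking an invariant across all orderings rather than the one you happen to get at runtime.

```python
# Sketch only: exhaustive schedule exploration for two read-modify-write clients.


def interleavings(a, b):
    """Yield every merge of a and b that preserves each sequence's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule) -> int:
    """Execute one schedule against a shared register and return its final value."""
    register = 0
    last_read = {}  # value each client saw on its most recent read
    for client, op in schedule:
        if op == "read":
            last_read[client] = register
        else:  # "write": store the previously read value plus one
            register = last_read[client] + 1
    return register


client_a = [("A", "read"), ("A", "write")]
client_b = [("B", "read"), ("B", "write")]

schedules = list(interleavings(client_a, client_b))
# Invariant: two increments should leave the register at 2.
lost_updates = [s for s in schedules if run(s) != 2]

if __name__ == "__main__":
    print(f"{len(lost_updates)} of {len(schedules)} schedules lose an update")
    for schedule in lost_updates:
        print("  ", " -> ".join(f"{c}.{op}" for c, op in schedule))
```

Only the two fully serial schedules keep the invariant; every schedule where both reads happen before either write loses an update, which is exactly the kind of ordering that rarely shows up in ordinary testing.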
If you do want to speak privately, about anything Cassandra related, the lowercase part of my username (i.e. without the _ prefix) at apache.org reaches me.