


	by hibikir 1050 days ago
	I have had to make that clustered RabbitMQ to Kafka move myself, as the failure modes from RabbitMQ were very scary: the scariest thing in the entire infrastructure, at financial-institution levels of scary. It's not that it failed much, but you don't need many middle-of-the-night calls with no good SOP for getting the cluster back to health before migrating is in the cards. Kafka is not operationally cheap. You probably want a person or two who understands how JVMs work, which might be something you already have plenty of, or an unfortunate proposition. But it does what it says on the tin. And when you are running fleets of 3+ digits worth of instances, very few things are more important.

1 comments

AtlasBarfed 1050 days ago

I have a dim view of almost all inherently single-node datastores that advertise a clustered hack (and they are hacks) as a patch-on (yes, even PostgreSQL). Sure it will work in most cases, but the failure modes are scary for all of them.

A distributed database will have network failures and will have conflicting writes, and it will have to pick between being down if any of the network is down (CP) or needing a "hard/complex" scheme for resolving conflicts (AP). Cassandra has tombstones, cell timestamps, compaction, repair, and other annoying things. Other databases use vector clocks, which are more complex and space-intensive than cell timestamps.
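
To make the cell timestamp vs. vector clock trade-off concrete, here is a minimal Python sketch. It is not how Cassandra or any particular database implements either scheme; the function names, the tuple layout, and the tie-breaking rule are all assumptions made for illustration. The point is that a timestamp is one number per cell and always picks a winner, while a vector clock carries one counter per node and can report that two writes are genuinely concurrent.

```python
from typing import Dict, Tuple

# Hypothetical cell layout for this sketch: (timestamp, value).
Cell = Tuple[int, str]

def lww_merge(a: Cell, b: Cell) -> Cell:
    """Last-write-wins: the higher timestamp wins.

    Ties fall back to comparing the value, which is an arbitrary but
    deterministic choice made for this sketch only.
    """
    return max(a, b)

# A vector clock keeps one counter per node that has written the value.
VectorClock = Dict[str, int]

def compare(a: VectorClock, b: VectorClock) -> str:
    """Return how clock a relates to clock b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    # Neither dominates: the writes conflict and something must resolve them.
    return "concurrent"
```

With timestamps, `lww_merge((10, "x"), (12, "y"))` silently keeps the later write. With vector clocks, `compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 2})` reports a conflict instead of hiding it, at the cost of storing a counter per node.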

It's tiring to see "move fast and break things" attitudes applied to databases. Yeah, sure, the first year of your startup can have that. But your database is the first thing to formalize, because your data is your users/customers: you lose your data, you lose your users/customers. And sorry, but scaling data is hard; it's not a one- or two-sprint "investigate and implement". In fact, if you do that, unless you are using a database the team has years of prior admin and performance experience with, you are doing it wrong.

"AWS/SaaS will eliminate it for me"

Hahahahaha. No it won't. It will make your life easier, but AWS DOESN'T KNOW YOUR DATA. So if something is corrupted or wrong or there is a failure, AWS might have more of the recovery options turnkeyed for you, but it doesn't know how to validate the success for your organization. It is blind trust.

AWS can provide metrics (at a cost), but it doesn't know performance or history. You will still need to know, if your data and volumes are at any scale, how to analyze, replicate, performance test, and optimize your usage.

And here's a fun story, AWS sold its RDS as "zero downtime upgrades". Four or five years later, a major version upgrade was forced by AWS .... but it wasn't zero downtime. Yeah, it was an hour or so and they automated it as much as they could. But it was a lie. And AWS forced the upgrade, you had no choice in the matter.

Most clustering vendors don't advertise (or don't even know) what happens in the edge cases where a network failure occurs in the cluster but the writes don't propagate in the "grey state" to all nodes. Then the cluster is in a conflicted write state. What's the recovery? If you say "rerun the commit log on the out of sync nodes" you don't understand the problem, because deletes are a huge wrench in the gears of that assumption.
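
Here is a small Python sketch of why deletes are the hard part when reconciling replicas. It is not the commenter's exact scenario and not any vendor's recovery procedure; the replica layout, `TOMBSTONE` marker, and function names are hypothetical. It assumes each replica maps a key to a (timestamp, value) cell and that reconciliation keeps the newest cell per key.

```python
# Hypothetical marker for a deleted cell; only used in this sketch.
TOMBSTONE = object()

def merge(a: dict, b: dict) -> dict:
    """Reconcile two replicas by keeping the newest cell for each key.

    Each replica maps key -> (timestamp, value).
    """
    merged = {}
    for key in set(a) | set(b):
        cells = [replica[key] for replica in (a, b) if key in replica]
        merged[key] = max(cells, key=lambda cell: cell[0])
    return merged

def live_rows(state: dict) -> dict:
    """Drop tombstoned cells and return key -> value for what is left."""
    return {k: v for k, (ts, v) in state.items() if v is not TOMBSTONE}

# A node that missed the delete during the network failure.
stale = {"user:1": (1, "alice")}

# Delete recorded as plain removal: a missing key carries no information,
# so the merge cannot tell "deleted" apart from "never written".
deleted_by_removal = {}
resurrected = live_rows(merge(deleted_by_removal, stale))

# Delete recorded as a tombstone with a newer timestamp: the delete wins.
deleted_with_tombstone = {"user:1": (2, TOMBSTONE)}
stays_deleted = live_rows(merge(deleted_with_tombstone, stale))
```

In the first case the deleted row comes back from the stale node; in the second the tombstone outranks the old write. That is the job tombstones do, and it is why they then have to be kept around and later compacted away.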

From my understanding of Cassandra, which Kafka appears to be similar to (from the numerous times I've looked) with quorums and the like, it's built on a lot of partition-resilient techniques.
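
For anyone unfamiliar with the quorum idea, here is a minimal Python sketch of the usual overlap rule: with N replicas, W write acknowledgements and R read responses, every read quorum shares at least one node with every write quorum exactly when R + W > N. The function names are made up for this example, and whether Kafka's replication actually follows this rule is the impression stated above, not something the sketch checks.

```python
from itertools import combinations

def quorums_overlap(n: int, w: int, r: int) -> bool:
    """The arithmetic rule: read and write quorums must share a node."""
    return r + w > n

def always_intersect(n: int, w: int, r: int) -> bool:
    """Brute-force check of the same property over every possible
    choice of w write nodes and r read nodes out of n."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )
```

For example, with three replicas, writing to two and reading from two always overlaps, while writing to one and reading from one can miss the latest write entirely.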

And Kafka has undergone Jepsen testing: https://aphyr.com/posts/293-jepsen-kafka

For those that don't know, aphyr will embarrass any distributed system given enough time. What is important is that

1) the distributed system is willing to subject itself to him and

2) they have a satisfactory response.

For an example of an unsatisfactory response, I give you MongoDB:

https://jepsen.io/analyses/mongodb-4.2.6

Note the "updates" section doesn't actually have them retry/repeat the testing. MongoDB just ran from the report. They claim it was fixed.

Anyway, if a system doesn't do that (submit to Jepsen testing), then IMO it is hiding some big big big red flags.


bit_flipper 1050 days ago

Not your main point, but MongoDB didn't commission Kyle to do that report as they had in the past; he did it on his own time. That's why his report doesn't mention repeat testing. They do actually run his tests in their CI, and those new tests were used to isolate that specific bug. Moreover, some of the complaints about weak durability defaults for writing were later fixed: https://www.mongodb.com/blog/post/default-majority-write-con.... They still do default to a weak read concern, but writes are fully durable unless you specifically change the behavior. For what it's worth, I agree with Kyle that they should have stronger defaults, but I don't really see a problem with MongoDB's response to the report because there is room to disagree on that.


AtlasBarfed 1050 days ago

Do you have a source for this? I got the impression at the time that there was some commissioning of his services, but that they didn't like the report. But he publishes work, and released the report, which forced them to deal with it.

Every distributed tech fails when he tests it, but the tenor and nature of the report for MongoDB was different. It basically said between the lines "do not use this product".

MongoDB has a history of really crappy persistence decisions and silently failed writes, and of saying "we fixed it in the next release" as soon as a problem gets publicized. The same thing happened here, of course. I simply don't trust the software or the company.

MySQL has the same annoying pattern in its history, although I have more confidence in the software because of the sheer number of users.

Still, I would probably pick PostgreSQL for both relational and document stores.


bit_flipper 1049 days ago

Source for which claim? Kyle was paid for work testing 3.4.0-rc3[1] and 3.6.4[2] which analyzed single document concurrency in a sharded configuration. Those tests run in their CI [3]. MongoDB had some somewhat misleading copy on their website about the result of those tests, so Kyle decided to test the new multi-document transactions feature for 4.2.6 and found some bugs.

It's fair to not trust the database or company, I don't blame you for that. But I think Kyle's MongoDB 4.2.6 report was not nearly as concerning as his PostgreSQL 12.3 report which found serializability bugs in a single instance configuration, among other surprising behaviors. MongoDB's bugs were at least in a new feature in a sharded configuration. I don't think his most recent report was actually as negative as it may read to you. I say this as someone who mostly runs PostgreSQL, by the way!

As a side note, I believe there are consistency bugs existing right now in both MongoDB and PostgreSQL (and MySQL and Cassandra and CockroachDB and...) waiting to be discovered. I'm a jaded distributed systems operator :)

[1] https://jepsen.io/analyses/mongodb-3-4-0-rc3

[2] https://jepsen.io/analyses/mongodb-3-6-4

[3] https://github.com/search?q=repo%3Amongodb%2Fmongo+jepsen&ty... (note: not an expert in when or what suites it runs, just have seen it running before as a demo)
