by tonyhb 1129 days ago
According to an SRE, one of their DB clusters failed. They use Vitess, which is great, but it can be prone to hotspots and doesn't auto-shard. Heavy usage (esp. from large customers or rogue jobs) can take down the cluster. When it goes down, it's a PITA to resolve.

This literally isn't true and looks awfully like the talking points of one of our competitors.
Ah, unbalanced shards caused by poorly chosen sharding keys were an issue at one point, IIRC. I remember talking with an SRE there when something bad happened at GitHub last year, and I know that this time the current DB cluster failed.

To be clear, I _was_ mapping previous incidents onto this year's incident; no competitor or hard feelings involved. I really like Vitess, fwiw. And the only thing I really love is FoundationDB :)

That wasn't clear.

Side note: "Autosharding" is largely a myth that unproven databases are touting. Sharding is complex and requires planning and control. Databases that start shuffling data around without oversight produce nasty surprises. Trying to be too magic is almost always a mistake with databases.

Yeah, fair, totally get it. Wasn't aiming to spread FUD, and I know that FDB is a little hard to compare against... it is pretty magic with how it routes and shards :D (https://forums.foundationdb.org/t/keyspace-partitions-perfor...)
For posterity: https://github.blog/2023-05-16-addressing-githubs-recent-ava...

It was the DB, and it was rogue usage on May 10, so I'm standing by my original comment.

What would you know, random Hacker News commen--oh. Hi Sam, carry on.
<3 Hey Sarah!