by tdrhq
672 days ago
> This is actually not an easy thing to do. If your shutdowns are always clean SIGSTOPs, yes, you can reliably flush writes to disk. But if you get a SIGKILL at the wrong time, or don’t handle an io error correctly, you’re probably going to lose data.
Thanks for the comment! This is handled correctly by Raft/Braft. With Raft, before a transaction is considered committed it must be committed by a majority of nodes. So if the transaction log gets corrupted, it will restore and get the latest transaction logs from the other node.
> I’m sorry, but I don’t think this was as persuasive as you meant it to be.
I wasn't trying to be persuasive about this. :) I was trying to drive home the point that you don't need a massively distributed system to make a useful startup. I think some founders go the opposite direction and try to build something that scales to a billion users before they even get their first user.
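To make the majority rule concrete, here is a rough sketch in Python. It is not Braft's API; the function names and the three-node cluster are made up for illustration, and it only models the counting argument, not leader election or log repair.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def is_committed(acks: int, cluster_size: int) -> bool:
    """An entry counts as committed once a majority of nodes has stored it."""
    return acks >= majority(cluster_size)


def surviving_copies(holders: set[str], failed: set[str]) -> set[str]:
    """Nodes that still hold the entry after the given nodes lose their log."""
    return holders - failed


if __name__ == "__main__":
    nodes = {"a", "b", "c"}   # hypothetical 3-node cluster
    holders = {"a", "b"}      # nodes that acknowledged the entry

    # Two of three acknowledgements is a majority, so the entry is committed.
    assert is_committed(len(holders), len(nodes))

    # If node "a" corrupts its log, "b" still has the entry to restore from.
    assert surviving_copies(holders, {"a"}) == {"b"}
```

The point is only that a committed entry lives on more than one node, so a single corrupted log can be rebuilt from a peer.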
|
I’m now completely lost as to why you believe this was a good idea over using something like MySQL/Postgres/Aurora. As I see it, you’ve added complexity in three different dimensions (novel DB API, novel infra/maintenance, and novel oncall/incident response) with minimal gain in availability and no gain in performance. What am I missing?
(FWIW, I worked on Bigtable/Megastore/Spanner/Firestore in a previous job. I’m pretty familiar with what goes into consensus, although it’s been a few years since I’ve had to debug Paxos.)
> I was trying to drive home the point that you don't need a massively distributed system to make a useful startup. I think some founders go the opposite direction and try to build something that scales to a billion users before they even get their first user.
This reads to me as exactly the opposite: overengineering for a problem that you don’t have.
For exactly the reasons you describe, I would argue the burden of proof is on you to demonstrate why Redis, MySQL, Postgres, SQLite, and other comparable options are insufficient for your use case.
To offer you an example: let’s say your Big Customer decides “hey, let’s split our repo into N micro repos!” and they now want you to create N copies of their instance so they can split things up. As implemented, you’ll now need to implement a ton of custom logic for the necessary data transforms. With Postgres, there’s a really good chance you could do all of that by manipulating the backups with a few lines of SQL.