by evanelias
2462 days ago
That is simply not correct. Please understand that I worked on MySQL at Facebook, so I know what I'm talking about here :)
Facebook developed an entirely new MySQL storage engine (MyRocks, which is a RocksDB-backed engine for MySQL) and then migrated their largest sharded tiers to it. This is basically just as much work as developing a new database from scratch, i.e. more work than something like migrating to Postgres. This completely debunks the "changing is basically impractical" claim.
And while Facebook's primary db tier (UDB) does have a restricted API / access pattern, calling it a "key-value store" is a gross oversimplification at best, or completely inaccurate at worst. Range scans are absolutely core to the UDB access pattern, for starters.
Many other social networks are also built on MySQL (LinkedIn, Pinterest, Tumblr; and several in China) or previously used MySQL before moving to a custom in-house db (Twitter). I think Reddit and Instagram are the only two using pg? And I recall parts of Instagram were being moved to MySQL, although I'm way out-of-date on whatever happened there.
|
I disagree for three reasons:
1. The long tail of code using MySQL at the company, like at any large software company, is prohibitive. You would have to maintain MySQL and PostgreSQL in parallel for years. A new storage engine, on the other hand, is controlled by one team.
2. Migrating from InnoDB to MyRocks consists of successively adding MyRocks replicas, letting them catch up, and removing InnoDB replicas. That is a dramatically easier proposition than migrating tiers to PostgreSQL.
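To make point 2 concrete, here is a rough sketch of that rolling replica swap. It is a toy in-memory model, not Facebook's actual tooling: `Replica`, `catch_up`, `rolling_engine_migration`, and the `lag` counter are all names made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    name: str
    engine: str      # "InnoDB" or "MyRocks"
    lag: int = 0     # hypothetical replication lag, in pending events


def catch_up(replica: Replica) -> None:
    """Pretend to apply the replication stream until the replica is current."""
    while replica.lag > 0:
        replica.lag -= 1


def rolling_engine_migration(replicas: List[Replica], backlog: int = 3) -> List[Replica]:
    """Swap each InnoDB replica for a MyRocks one, one at a time.

    A new replica is added and caught up before an old one is removed,
    so the set never drops below its starting size.
    """
    current = list(replicas)
    for old in [r for r in replicas if r.engine == "InnoDB"]:
        new = Replica(name=f"{old.name}-myrocks", engine="MyRocks", lag=backlog)
        current.append(new)   # 1. add a MyRocks replica
        catch_up(new)         # 2. let it catch up on replication
        current.remove(old)   # 3. remove an InnoDB replica
    return current


tier = [Replica(f"db{i}", "InnoDB") for i in range(3)]
migrated = rolling_engine_migration(tier)
print([(r.name, r.engine) for r in migrated])
```

Nothing in the loop touches application code or the schema, which is the whole point: clients keep talking MySQL the entire time.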
The fact that RocksDB was a hard technical project is kind of irrelevant. The new storage engine provided major wins and could be done within a team, while migrating to PostgreSQL would provide at most small improvements and demand changes to huge amounts of code and massive data migration projects. That makes the former project deeply practical and the latter impractical. If the usual stack back in the day had been the LAPP stack instead of the LAMP stack, we would be having this discussion the other way.
> calling it a "key-value store" is a gross oversimplification at best
That's fair. The right thing to have said would be that the query patterns that are used are extremely simple selects over a single table, which is a place that MySQL has traditionally shone. MySQL's query planner still does strange things on complex queries from time to time. I had a case about six months ago where one shard decided it was going to reorder indexes in a query and load everything in the database's core tables before filtering it down instead of using the proper index order like the other nine hundred something shards. Easily fixed once we realized it (we forced the index order in the query), but the fact that we had to... I have heard that this has all gotten much better in MySQL 8.0.