Hacker News new | ask | show | jobs
by glogla 1939 days ago
Sure, but it does not really matter.

Uber has not switched from Postgres used as RDBMS to MySQL used as RDBMS, they switched from Postgres used as RDBMS to MySQL used as key-value storage layer of homegrown sharded non-relational database.

This has pretty much no bearing on anyone using Postgres or MySQL in reasonable way.

2 comments

Exactly. I think the prior HN discussions failed to call out how painful it is to do any sort of schema migration against a big InnoDB table [1][2].

Well known MySQL uses such as Facebook TAO and this Uber Schemaless are typically abstractions built on top of MySQL, which means the schemas are pretty much static, and they don't feel the schema migration pain.

For a typical RoR startup that relies on a RDBMS, please, stay away from MySQL.

[1] Yes, I know about the INSTANT ADD COLUMN patch from Tencent Games that landed in MySQL 8.0, and which has had major bug fixes in at least 8.0.14 and 8.0.20.

[2] A side effect is that MySQL now has a thriving ecosystem of schema migration tools (pt-osc, lhm, gh-ost), while Postgres has none, and there are situations where there is indeed no choice but to rewrite the table, e.g. changing a column type from int to bigint.

To echo and add to rwultsch's sibling comment:

* Facebook had extremely frequent schema changes, and powerful declarative schema management automation to support this

* The TAO (or more correctly "UDB") use-case supported using many separate tables, not one giant generic key/value table as people seem to assume

* The non-UDB MySQL use-cases at Facebook, in combination, are still larger than the vast, vast majority of all other companies' databases. These non-UDB databases use a wide range of MySQL's functionality. The frequent claims that "Facebook used MySQL just as a dumb K/V store" are absolutely incorrect and have never been correct.

Thanks for the insight! So how frequent does schema change happen to UDB?
Sorry, I don't recall the exact frequency, I left FB in 2015.
Then I guess my point about UDB doesn't require schema change will still stand then? Looking at https://github.com/facebookarchive/linkbench, the table structure looks pretty much "final". And judging by https://dom.as/2015/07/30/on-order-by-optimization/, the access pattern is pretty much fixed as well.
UDB had hundreds of tables per shard, and although there are a few common patterns, they did not all have an exactly identical structure. You have two former FB database engineers in this thread (myself + rwultsch) telling you your statement is incorrect. Nothing in Domas's post discusses lack of schema changes or tables being identical.

Linkbench is unmaintained and does not attempt to mirror the entirety of UDB, just its access patterns: point lookups by PK, and range scans over a secondary index. A fixed access pattern is not the same thing as having no schema changes.

Even putting column changes aside, the entirety of UDB was migrated from InnoDB to MyRocks in 2017, which is essentially a schema change across every single UDB table in every single UDB shard.

And besides, as I mentioned already, the non-UDB MySQL use-cases at Facebook are larger than the vast majority of companies' databases -- larger than the next-largest US social network, even. The non-UDB tiers had dozens of schema changes every single day.

As rwultsch correctly mentioned, Facebook's extreme agility with schema changes is directly what inspired me to create https://www.skeema.io, an open source project offering declarative schema change management. It's used by GitHub, Twilio, and a number of other well-known companies.

Please stop making incorrect statements based on things you have no direct experience with.

FB had plenty of schema changes. I know, I wrote software to push them out. The important concepts for pushing schema at scale became part of Skeema.
What about Postgres native key-store called HStore?

https://www.postgresql.org/docs/8.3/hstore.html

I love Postgres just as much as anyone but Uber use case still seemed to be a better fit for MySQL. I was hopeful this would kickstart a renewed focus on features / architecture within thr Postgres community and I’m not certain anything resulted from this. Hope to be wrong obviously.