Uber has not switched from Postgres used as RDBMS to MySQL used as RDBMS, they switched from Postgres used as RDBMS to MySQL used as key-value storage layer of homegrown sharded non-relational database.
This has pretty much no bearing on anyone using Postgres or MySQL in reasonable way.
Exactly. I think the prior HN discussions failed to call out how painful it is to do any sort of schema migration against a big InnoDB table [1][2].
Well known MySQL uses such as Facebook TAO and this Uber Schemaless are typically abstractions built on top of MySQL, which means the schemas are pretty much static, and they don't feel the schema migration pain.
For a typical RoR startup that relies on a RDBMS, please, stay away from MySQL.
[1] Yes, I know about the INSTANT ADD COLUMN patch from Tencent Games that landed in MySQL 8.0, and which has had major bug fixes in at least 8.0.14 and 8.0.20.
[2] A side effect is that MySQL now has a thriving ecosystem of schema migration tools (pt-osc, lhm, gh-ost), while Postgres has none, and there are situations where there is indeed no choice but to rewrite the table, e.g. changing a column type from int to bigint.
* Facebook had extremely frequent schema changes, and powerful declarative schema management automation to support this
* The TAO (or more correctly "UDB") use-case supported using many separate tables, not one giant generic key/value table as people seem to assume
* The non-UDB MySQL use-cases at Facebook, in combination, are still larger than the vast, vast majority of all other companies' databases. These non-UDB databases use a wide range of MySQL's functionality. The frequent claims that "Facebook used MySQL just as a dumb K/V store" are absolutely incorrect and have never been correct.
I love Postgres just as much as anyone but Uber use case still seemed to be a better fit for MySQL. I was hopeful this would kickstart a renewed focus on features / architecture within thr Postgres community and I’m not certain anything resulted from this. Hope to be wrong obviously.
The main point, clustered vs. nonclustered indexing, is architectural, and not inherently inefficient; it depends on the use case.
"Highly advanced" databases give both options, but AFAIK, MySQL/PGSQL will likely not offer this, at least for a very long time, since it requires radical changes.
On the one hand, MySQL has offered this for two decades, by virtue of pluggable storage engines being core to its design. Some storage engines use clustered indexes and some do not. The user can decide which one matches their use-case; very large companies can design their own custom special-purpose storage engines; etc.
On the other hand, mixing storage engines in a single db instance has operational downsides (especially re: crash-safe replication). And InnoDB is by far the dominant storage engine, and is probably unlikely to offer nonclustered indexing, so from that perspective I agree with your point.
I still remember the "good" ol days when the default MySQL engine was MyISAM. Even after InnoDB became the default, a lot of people were still configuring it for MyISAM for their 30 user webapp because they'd heard it was faster, and besides, you can ensure data integrity in code, right?
I made a bit of money freelancing on "my database for my LAMP stack app is corrupt!" issues by a) demonstrating that InnoDB wouldn't slow down their webapp in any measurable form and then b) trying to save and normalise as much data as possible.
It'll be interesting to see how things shake out when some of the other implementations using postgres's pluggable storage API start maturing. I wonder if it'll have some of the same operational downsides that mixing storage in MySQL has.
Good question. I assume it depends on how Postgres handles multi-engine transactions, and how it stores replication state metadata. A good discussion of the issue in MySQL/MariaDB is here: https://kristiannielsen.livejournal.com/19223.html
Apparently MariaDB 10.3+ has this solution implemented, which is cool, never knew that before. I don't think there's anything equivalent in MySQL.
The replication was redone around that time (not sure which version exactly). It's still working on the same principles though, sending queries and redoing them on each replica.
Before in short, the WAL was sent every minute and always 10MB even if there were no changes. Now it's more adaptive, actually doing nothing when they are no changes, and picking up quicker when changes begin.
I am surprised they don't mention this point because the replication was really unusable in PostgreSQL.
There are still spikes (write amplification) and other drawback from this design, but at least it doesn't shit itself under no activity.
I don't understand this. Why would it be sending queries and redoing on a replica and sending the WAL? Just sending the WAL would seem to be sufficient, or alternatively: sending queries would be redundant if you just send the WAL and apply directly at the secondaries.
It doesn't make sense to you because everything in the comment you're replying to is wrong.
Neither log shipping (copying WAL files one by one) nor streaming replication (sending a stream of WAL) works by sending queries. WAL segments are 16MB by default, and the default archive_timeout is 0, not 1 minute (and the archive timeout is not applicable to streaming replication anyway). There is also nothing "adaptive" about the replication—when there is no traffic, there will be ~no changes, and when there are changes, they will be sent to the replica.
I don't understand what the comment is suggesting used to happen in periods of no activity that made replication unusable, but it is also probably incorrect, and has nothing to do with the write amplification problem.
I was calling the WAL the "queries" to simply, never mind that, it doesn't matter whether it contains the queries or not.
What's important is that the WAL was generated on a periodic basis and of a constant size. Say 16MB every minute. It's pretty much a plain file, that could be stored on S3/FTP.
This had a lot of drawbacks:
- Replicas were measurably late behind the current state, simply because of the built-in delay in "replication".
- It was incredibly inefficient on bandwidth and storage. Consider the time it takes to transfer large files (especially for off-site replicas) and storage costs. That further contributed to poor performance and delay.
- There could be many WAL files generated at once when there were changes happening. They would take FOREVER to be processed. It was commonplace for replicas to fall 5-10 minutes under what I consider to be minimum activity.
Long story short, the replication was reworked in a later version of PostgreSQL (3 or 4 years ago), the part about fixed size and fixed delay is not true anymore.
In streaming (physical) replication, PostgreSQL sends the WAL only, and it's applied on the replica; no "sending of queries" is involved, or even possible; with physical replication, the secondary has to keep itself identical with the primary, otherwise replication will fail. This is why you can't use physical replication across major versions, since they can't be bit-for-bit identical.
In more recent versions there's "logical replication", which sort of "sends the queries", in that the secondary node has its own database state that does not have to be exactly identical with the primary, allowing for replication across major versions.
In my opinion though, unless you really need logical replication for some reason, stick with streaming replication. It's much easier to understand and there are fewer failure modes.
> In more recent versions there's "logical replication", which sort of "sends the queries"
What it sends is not the queries, but a logical description of the changes to each row that were made by running the query. So an UPDATE that changes N rows would generate N changes to be applied to the corresponding rows (usually identified by primary key) on the logical replica, not a single update that had to be "re-executed".
Uber has not switched from Postgres used as RDBMS to MySQL used as RDBMS, they switched from Postgres used as RDBMS to MySQL used as key-value storage layer of homegrown sharded non-relational database.
This has pretty much no bearing on anyone using Postgres or MySQL in reasonable way.