|
|
|
|
|
by cookiecaper
3603 days ago
|
|
You're obviously qualified so I'd love to get more info on how you deployed MySQL replication. I have a low opinion of it after using it at my employer for the last few years. MySQL allows writes to slaves (yikes) and the binlog coordinates are a constant hassle. ROW, STATEMENT, and MIXED binlog formats can make interoperability a mess and further increase the risk that there will be discrepancies between slave and master data (for some reason, we're on STATEMENT and/or MIXED replication, which means that the slave receives a list of the queries to execute and may have different output than the master). To sync up a new slave, you have to set up slave_skip_errors to ignore primary key conflicts and other common errors that occur because it's impossible to really get a firm binlog coord (it's really only marginally possible due to Percona's efforts with innobackupex/mydumper). I know that GTIDs are available now and are supposed to resolve the issues with binlog coords, but we haven't activated/used this featureset yet. Is it really a magic bullet? MySQL just seems to do replication in a very unstable way that no other major RDBMS is willing to approach. I use PostgreSQL replication on a personal project and while it takes a smidgen longer to set up (maybe 20 minutes instead of 5), it seems well worth it to me. I never have to worry about whether my slave's output matches the master (because the query is performed on the master and the result shipped out via the WAL and that's the only way to use replication). I don't have to worry that I connected to the wrong endpoint and accidentally wrote to the slave, causing a conflict that requires a new full dump and resync. I don't have to worry that binlog coords are going to be wrong or that a write will go in at just the wrong moment and potentially require me to redump and resync the whole thing again (but at least require me to skip errors). How do you handle these issues with MySQL replication, and justify the risks in order to shave a few minutes off the upfront setup time? |
|
Both statement and row based replication can be very reliable in modern versions of MySQL. It is my experience the ways data is corrupted are: 1. read_only not being set on slaves, so random users can write to the slave. We set read_only on startup based on service discovery. 2. Bad automation for failovers. See https://github.com/pinterest/mysql_utils/blob/master/mysql_f... for how we do it. 3. Crashes without all the durability settings being on.
If you are having to run slave_skip_errors, you are doing it wrong. You should checkout out our automation for backups and restores. They can be found in mysql_restore.py and mysql_backup.py .
With regard to PG replication, I suggest you watch https://m.youtube.com/watch?v=bNeZYVIfskc&t=26m54s Uber had a 16 hours outage in large part caused by pg replication issues.
Vacuum issues are also no joke.
There are a host of other issues: MySQL can deal with large numbers of connections, PG needs middleware. MySQL is more efficient with "web" workloads where most queries only need to pull on row. etc...
-Rob