by bluedonuts
3597 days ago
This is cool. I don't think it would be useful in my case, though. 99% of the time, the lag I see is caused by heavy inserts on the master. These hit all the slaves at the same time, so this wouldn't mitigate that issue. It got me wondering: what is the common cause of slave lag for other people?
|
- say the monitoring on one of the slaves breaks and it runs out of disk space, so it can no longer write locally. its lag will go up and/or the health check will see that its replication is failing, and it will automatically be taken out of the pool.
- say you run a site where a user has a list of songs they like. some users have larger libraries than others, but there are a few outliers that have 100x the number of songs. you're growing fast, and you have some code paths that are not optimized for this and some features that folks barely use, so you don't spend time on them. one of these users goes on your site and uses one of those features, and now you have 1 slave that is lagging because the overly intense query landed on it.
- say you have a high-traffic site, or you are in a datacenter where you share networking gear with a very high-traffic neighbor, and the datacenter has oversubscribed that gear, so there is no bandwidth headroom. now you have slave lag whenever your replication passes through a saturated switch. maybe this only shows up at some times of the day, on some slaves, so those get taken out of the pool and servers with better network paths are prioritized.
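
To make the "taken out of the pool" part concrete, here is a rough sketch of the kind of check I mean. It is not the actual implementation of any tool. `ReplicaStatus`, `healthy_pool`, the replica names, and the 5 second threshold are all made up for illustration, and it assumes you already have some way to read each slave's lag, with `None` standing in for "replication is stopped or broken".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReplicaStatus:
    """Hypothetical snapshot of one slave, as reported by a health check."""
    name: str
    lag_seconds: Optional[float]  # None means replication is stopped or broken


def healthy_pool(replicas: List[ReplicaStatus], max_lag_seconds: float = 5.0) -> List[str]:
    """Return the names of the replicas that should stay in the read pool."""
    return [
        r.name
        for r in replicas
        # Drop replicas whose replication is broken or whose lag is over the limit.
        if r.lag_seconds is not None and r.lag_seconds <= max_lag_seconds
    ]


statuses = [
    ReplicaStatus("replica-a", 0.4),    # keeping up
    ReplicaStatus("replica-b", 42.0),   # lagging, e.g. one heavy query landed on it
    ReplicaStatus("replica-c", None),   # replication failing, e.g. disk full
]
print(healthy_pool(statuses))  # ['replica-a']
```

If every slave lags at once, as with heavy inserts on the master, this returns an empty list, so a real setup would need some fallback for that case.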