Is it just me or does looking at write logs to determine read consistency just seem like a really bad way to do this vs a different load balancing strategy?
Definitely feels like bandaid, over a more robust solution like sharding their main database by repository. Or splitting off some of their tables to another database. Their load of "millions of requests per hour", doesn't sound particularly high either.
At some point their primary is going to get overloaded again by the writes. And they'll have added all of this machinery for nothing. They've also made the replicas an essential part of their stack. Whereas their previous stack could probably tolerate a certain amount of replicas going down or replica lag. They now have a system that will grow and be dependent on all the replicas being available. Until a heavy write load or heavy table alter, causes the replicas to become lagged at which point a higher percentage of traffic will go to the master and potentially cause downtime.
Yikes. Excellent point. Is sharding the only option here? It seems like having immensely shared Postgres is one way of solving the problem, but is that the only option?
At some point their primary is going to get overloaded again by the writes. And they'll have added all of this machinery for nothing. They've also made the replicas an essential part of their stack. Whereas their previous stack could probably tolerate a certain amount of replicas going down or replica lag. They now have a system that will grow and be dependent on all the replicas being available. Until a heavy write load or heavy table alter, causes the replicas to become lagged at which point a higher percentage of traffic will go to the master and potentially cause downtime.