Hacker News new | ask | show | jobs
by Dave3of5 1498 days ago
A little note about read replicas and problems I've discovered. It's often the case in code that you write a value to the DB then immediately read it back often in the form of a different query somewhere else.

If you are setup to do some kind of a round robin read from the read replicas you can often get a different read from what you wrote as the value hasn't replicated to your read replicas yet. The solution is to use the write endpoint when reading after a write.

He says that here but just wanted to point out that it can happen inside an api and cause real issues with data.

1 comments

This depends on the database and consistency level it’s enforcing. You can often configure databases to require an ack from the replicas before it returns, so that you’ll be able to read your writes. This obviously has a trade off with speed.

Some databases are cleverer about this. Things like Spanner and FoundationDB work differently so as to be fast to both read and write, but they’re much more complex to operate and use.

There is another quick trick though… if a client performs a write, set a bit in their session that causes all of their reads to come from the primary database for a short period, maybe a few seconds, just enough to cover the replication latency. This is a hack. It’s got a lot of downsides, but it’s a quick way to patch the problem if truly necessary.

I think that's an interesting idea.

Here's a post that benchmarked multi-region Postgres (Elixir/Phoenix on fly.io): https://nathanwillson.com/blog/posts/2021-09-25-fly-multi-db...

According to the post, for some users (residing in Japan, and the primary instance being located in Amsterdam), a query could take ~200ms (median). If multiple queries are performed for each request, that could mean 1 second or more per API call - not so great if that's the case for multiple seconds after each write. I think this would eventually lead to putting more code in stored procedures, begging the question: why not use a distributed DB like Fauna in the first place?

Alternatively, the replication problem could be accounted for in the app itself. E.g. the SPA or the edge instance could retry reads following a write until the change from the primary instance has propagated, and up until then pretend that everything went fine. In case a write isn't replicated within 10 seconds or so, show an error to the user and let them retry the write action. This could lead to duplicate entries, but I'd estimate the chance for that to be quite low.

Yeah 200ms to the database feels essentially useless for a webapp. I'd normally expect <5ms for a typical app, and <1ms is ideal – that usually means the same rack or very close.

This time all adds up. For a moderately complex webapp something like 10-20 queries isn't unheard of (even when you're careful about N+1 queries and caching). If each of those is taking ~5-10ms, add the network latency, and you're at ~50-150ms, which is pretty much your time budget for most end-user-facing webapps.

This is one of the problems with compute at the edge, you have to be much smarter about these things because there are a lot more network roundtrips between your app and database than there are between your customer and your app. Edge replicas help but still complicate things.

That's actually like what was done on the project in question. If you use a transaction it was set to always use the primary that way you can read and write all you want in your queries.

For this ACK that you talk about. It's AWS aurora mysql specifically here do you know if that's a setting you can setup?

Sorry no idea about AWS Aurora, but I think MySQL will do this itself.
Unfortunately, it won't. Traditional MySQL replication does not provide a built-in option for fully synchronous behavior.

You can optionally use MySQL's "semi-sync" replication feature, in which replicas ACK receipt of transactions. This is purely for durability though, not consistency: it ensures commits are durable even if the primary DB has an unrecoverable hardware failure, but without having the massive latency penalty of fully-synchronous replication.

With semi-sync, the replicas are essentially just confirming they've queued up the transaction in their local relay log, rather than acknowledging execution of the transaction. The replicas may still be lagging behind the primary, in terms of what transactions they've applied locally; this means they'll return stale reads.

That said, the story is completely different in AWS Aurora, which (by default) uses proprietary physical (storage-level) replication. Within one region, AWS docs say their lag is "usually much less than 100 milliseconds", but they also note lag depends on transaction volume, so it's unclear what sync/async tradeoffs they're making behind the scenes.

In any case, for a nice third-party implementation of read-after-write consistency in MySQL, ProxySQL has a really powerful feature: https://proxysql.com/blog/proxysql-gtid-causal-reads/