by TheDong 1057 days ago
One of the linked fixes is: https://github.com/mastodon/mastodon/commit/13ec425b721c9594...

It seems like it was trivial to make it happen atomically.

There just wasn't a need to before since them not being atomic isn't an issue, unless you have a poor configuration like someone pointing sidekiq at a stale database server (sorry, a replica), which I see as the primary issue here.

4 comments

Maybe I’m missing something, but if it’s not atomic, it doesn’t matter whether there’s a replica or not: sidekiq (whatever that is) might do a read between steps 2 and 3.

I see several problems in their setup really

- lack of strong consistency

- using eventually consistent data (the replica) to make business decisions

- no concurrency control (pessimistic or optimistic)

I don’t know much about mastodon but, while not trivial, those are pretty basic systems design concepts
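
To make the "optimistic" option concrete, here is a minimal sketch using Python's built-in sqlite3. The `accounts` table, the `version` column and the function names are all made up for illustration and are not Mastodon's schema or code. The idea is that a writer only succeeds if the row still has the version it read earlier.

```python
import sqlite3

# Hypothetical schema for illustration only (not Mastodon's actual tables).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, uri TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts VALUES (1, 'https://remote.example/users/a', 1)")
conn.commit()


def read_account(conn, account_id):
    """Return (uri, version) as currently stored."""
    return conn.execute(
        "SELECT uri, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()


def update_if_unchanged(conn, account_id, new_uri, expected_version):
    """Write only if nobody bumped the version since we read it."""
    with conn:  # commit on success, roll back on exception
        cur = conn.execute(
            "UPDATE accounts SET uri = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_uri, account_id, expected_version),
        )
    # rowcount is 0 when the version check failed, i.e. we lost the race.
    return cur.rowcount == 1
```

Two workers that both read version 1 can both call `update_if_unchanged`, but only the first call returns `True`; the second gets `False` and has to re-read and decide again instead of silently overwriting.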

> There just wasn't a need to before since them not being atomic isn't an issue

I disagree: there clearly is an issue with a non-local account having a null URI. It’s unlikely but totally possible for the server to crash in between query 1 and query 2, irrespective of database replication stuff. This is a textbook example of why you use database transactions.
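
A rough sketch of what I mean, in Python with sqlite3 rather than the actual Rails code. The table, the column names and the two-step update are assumptions for the sake of the example; I have not reproduced the real queries. A simulated crash between the two writes leaves the row untouched because both writes sit in one transaction.

```python
import sqlite3


class SimulatedCrash(Exception):
    """Stands in for the server dying between query 1 and query 2."""


def make_db():
    # Hypothetical schema for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, domain TEXT, uri TEXT)"
    )
    conn.execute(
        "INSERT INTO accounts VALUES (1, 'remote.example', 'https://remote.example/users/a')"
    )
    conn.commit()
    return conn


def update_account(conn, account_id, new_uri, crash_between=False):
    # `with conn` commits if the block finishes and rolls back if it raises,
    # so a reader never sees the state after query 1 but before query 2.
    with conn:
        conn.execute("UPDATE accounts SET uri = NULL WHERE id = ?", (account_id,))  # query 1
        if crash_between:
            raise SimulatedCrash()
        conn.execute(
            "UPDATE accounts SET uri = ? WHERE id = ?", (new_uri, account_id)
        )  # query 2


def current_uri(conn, account_id):
    return conn.execute(
        "SELECT uri FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()[0]
```

Calling `update_account(conn, 1, ..., crash_between=True)` raises, and `current_uri(conn, 1)` afterwards still returns the original URI instead of NULL. Without the transaction, the same crash would leave a non-local account with a null URI.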

I think even without that there was still, at least theoretically, a race condition.
OTOH, reading from a read-only copy reduces load on the master