by tesseract 1891 days ago
The article didn't really go into the details of the implementation but my mental image was basically your second proposal: the deduction of the fee should happen in the same database transaction as the updating of a "fee-last-charged" timestamp, relying on the database for thread safety and allowing the cron job itself to be stateless (and inherently protected against simultaneous execution of multiple instances of itself).

This could still lead to a double-charge, though.

If some rogue deploy script spawns fifty instances of the cron job (and weirder things happen every day), one of them could be reading stale state while another's transaction is still in flight. A classic data race.

Eliminating this problem is left as an exercise for the interested reader...

The way to fix that is to remove state altogether.

  1. Balance is not a simple variable, but the sum of all credits and debits to an account
  2. A fee is a charge record in your database
  3. This fee has a database constraint that you can have only one record per month
Now you can run the script that charges dormancy fees as often as you want: the constraint rejects any second fee for the same month.
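
For the curious, here is a minimal sketch of those three points using Python's built-in sqlite3 module. The table name `ledger`, its columns, the `'YYYY-MM'` period format, and the helper names are all assumptions made up for illustration, not anything from the article. The partial unique index needs SQLite 3.8.0 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ledger (
        id         INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL,
        amount     INTEGER NOT NULL,  -- cents; credits positive, debits negative
        kind       TEXT    NOT NULL,  -- e.g. 'deposit' or 'dormancy_fee'
        period     TEXT    NOT NULL   -- 'YYYY-MM' the entry belongs to
    );
    -- Point 3: at most one dormancy fee per account per month.
    CREATE UNIQUE INDEX one_fee_per_month
        ON ledger (account_id, period) WHERE kind = 'dormancy_fee';
""")


def charge_dormancy_fee(conn, account_id, period, fee_cents):
    """Record the fee. Returns True if charged, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO ledger (account_id, amount, kind, period) "
                "VALUES (?, ?, 'dormancy_fee', ?)",
                (account_id, -fee_cents, period),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected a duplicate fee for this month.
        return False


def balance(conn, account_id):
    """Point 1: the balance is derived, never stored."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM ledger WHERE account_id = ?",
        (account_id,),
    ).fetchone()
    return row[0]


with conn:
    conn.execute(
        "INSERT INTO ledger (account_id, amount, kind, period) "
        "VALUES (1, 10000, 'deposit', '2020-01')"
    )

# Simulate the rogue deploy: fifty attempts to charge the same month.
results = [charge_dormancy_fee(conn, 1, "2020-03", 500) for _ in range(50)]
```

Only the first of the fifty attempts inserts a row; the rest hit the index and return False, so the derived balance reflects a single fee.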
This is exactly the right approach, and the easiest way to implement idempotency in many realistic systems. Instead of thinking about idempotency in verbs - "perform action iff action has not yet been performed" - think about it in nouns - "create a piece of data whose key is the tuple of its inputs".

Practically speaking, this narrows your "transaction window" significantly - instead of:

  1. Begin transaction
  2. Check to see if work has already been done
  3. Do work
  4. Persist work
  5. Commit
with a potentially long transaction spanning steps 1 through 5, you do this:

  1. Do work
  2. Persist work to key / table with uniqueness constraint
  3. On conflict, do nothing (looks like you already did the work before)
Of course, if "Do work" is very expensive, you can bring back "Check to see if work has already been done" as an optimization, but for many simple CRUD examples, it's actually /cheaper/ to learn that the work has already been done via the conflict check failing than via an explicit pre-flight check.
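
A rough sketch of the shorter three-step flow, again with Python's sqlite3. The `statements` table, the `do_work` stand-in, and `run_once` are hypothetical names for illustration. It uses `INSERT OR IGNORE`, which is SQLite's spelling of "on conflict, do nothing"; other databases have their own syntax. Note the assumption baked in: since the work runs before the conflict is detected, `do_work` itself has to be safe to repeat.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE statements (
        customer_id INTEGER NOT NULL,
        month       TEXT    NOT NULL,
        body        TEXT    NOT NULL,
        PRIMARY KEY (customer_id, month)  -- the key is the tuple of inputs
    )
""")


def do_work(customer_id, month):
    # Hypothetical stand-in for the real work. It may run more than once
    # for the same inputs, so it must have no side effects of its own.
    return f"statement for customer {customer_id}, {month}"


def run_once(db, customer_id, month):
    """Returns True if this call persisted the result, False on conflict."""
    body = do_work(customer_id, month)            # 1. do work
    with db:                                      # short transaction: one insert
        cur = db.execute(                         # 2. persist under the unique key
            "INSERT OR IGNORE INTO statements (customer_id, month, body) "
            "VALUES (?, ?, ?)",
            (customer_id, month, body),
        )
    return cur.rowcount == 1                      # 3. zero rows inserted: already done


first = run_once(db, 7, "2020-03")
second = run_once(db, 7, "2020-03")
```

The transaction here covers a single insert rather than a check, the work, and the write, which is the narrowing of the window described above.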
At least one of the two cron jobs will fail to commit its transaction though.