by hammock 1822 days ago
> 2. Understand how liberating idempotency is

Can you give some examples?

3 comments

Say you have a batch job that performs tasks A–E, and each of those tasks might process thousands of records. At some point, something will go wrong that causes the process to crash, hang, or error over many records. If the code is not idempotent, you need to investigate exactly where things started going wrong and figure out how to resume the process at that point. You don't want to reprocess records and e.g. send out duplicate emails or double-increment some number you're tracking. If the code is idempotent, conversely, you can just start the whole process over again without having to worry about any of that.
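A minimal sketch of that difference in Python. The record shape and the names `run_batch_naive` and `run_batch_idempotent` are hypothetical, made up for illustration only; a real job would keep this state in a database rather than a dict.

```python
# Hypothetical records; the "id" and "amount" fields are assumptions.
RECORDS = [
    {"id": "r1", "amount": 10},
    {"id": "r2", "amount": 5},
    {"id": "r3", "amount": 7},
]


def run_batch_naive(state, records):
    """Not idempotent: every run adds to the running total again."""
    for record in records:
        state["total"] = state.get("total", 0) + record["amount"]


def run_batch_idempotent(state, records):
    """Idempotent: each record writes to its own slot, keyed by id.

    Reprocessing a record overwrites the same slot with the same value,
    so the derived total does not change on a rerun.
    """
    applied = state.setdefault("applied", {})
    for record in records:
        applied[record["id"]] = record["amount"]
    state["total"] = sum(applied.values())


naive_state = {}
run_batch_naive(naive_state, RECORDS)
run_batch_naive(naive_state, RECORDS)  # rerun after a crash: everything counted twice

safe_state = {}
run_batch_idempotent(safe_state, RECORDS[:2])  # simulated crash partway through
run_batch_idempotent(safe_state, RECORDS)      # restart from the beginning
```

With the keyed version you never need to know where the first run stopped; starting over converges on the same state.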

Similarly, many systems involve consuming from some message queue. It's basically impossible to guarantee exactly-once delivery in most systems. You either have to risk missing a message, or having it delivered multiple times. If you're running idempotent code, you can always err on the side of redundant delivery without any ill effects.
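One common way to get there is to deduplicate on a message ID, sketched below. `DedupingConsumer` and the message IDs are hypothetical, and the in-memory set is an assumption made to keep the example short; a real consumer would need the seen IDs in durable storage.

```python
class DedupingConsumer:
    """Wraps a handler so that redelivered messages are ignored."""

    def __init__(self, handler):
        self._handler = handler
        self._seen_ids = set()  # assumption: in-memory only, lost on restart

    def receive(self, message_id, payload):
        """Return True if the message was handled, False if it was a duplicate."""
        if message_id in self._seen_ids:
            return False
        self._handler(payload)
        # Record the ID only after the handler succeeds, so a failed
        # message can be redelivered and retried.
        self._seen_ids.add(message_id)
        return True


processed = []
consumer = DedupingConsumer(processed.append)

# At-least-once delivery: "m1" arrives twice.
deliveries = [
    ("m1", "create order 42"),
    ("m2", "ship order 42"),
    ("m1", "create order 42"),
]
results = [consumer.receive(mid, body) for mid, body in deliveries]
```

Here the queue can redeliver as often as it likes and the handler still runs once per distinct ID.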

Last week I ordered a screen protector for my phone. I got two boxes in the mail. I thought I ordered twice by mistake but they had the same order number on the packing slip.

My immediate thought was that some order processing step somewhere is not idempotent.

How do you make something idempotent if one of the effects is sending an email?
You can't make everything idempotent.

You should still try to make most things idempotent.

Side effects go to the border of the app. You keep track of whether you have already sent the email for some message X, so you can guarantee that you send it only once.
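A rough sketch of that bookkeeping using Python's built-in `sqlite3`. The `sent_emails` table, `send_email_once`, and the `outbox` list standing in for a mail gateway are all hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sent_emails (message_id TEXT PRIMARY KEY)")

outbox = []  # stand-in for a real mail gateway (assumption)


def send_email_once(conn, message_id, recipient, body):
    """Send only if this message_id has not been recorded before."""
    with conn:  # commits on success, rolls back if sending raises
        cursor = conn.execute(
            "INSERT OR IGNORE INTO sent_emails (message_id) VALUES (?)",
            (message_id,),
        )
        if cursor.rowcount == 0:
            return False  # row already existed: this email was sent earlier
        outbox.append((recipient, body))
    return True


first = send_email_once(conn, "order-1001", "a@example.com", "Your order shipped")
second = send_email_once(conn, "order-1001", "a@example.com", "Your order shipped")
```

This only narrows the window. If the process dies after the mail has really left but before the commit, a rerun will send it again, which is why the effect itself can't be made perfectly idempotent.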
Idempotency means you can run something several times and end up with the same result as running it once.

Let's say you're controlling a factory and you have a function that fills a reservoir. A naive way would be to define the semantics of the operation as "send enough liquid" and open the circuit for 10 minutes when activated. Since it takes 10 min to fill, all is good.

But what if your program crashes and you have to rerun it? If the reservoir was already partially filled, you will overflow it.

If the semantics of the operation is "send liquid until 'full' sensor indicates you're done", that's liberating. You no longer have to worry about overflowing the reservoir.
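The two semantics can be simulated in a few lines. The `Reservoir` class, its capacity of 100 units, and the fill rate of 10 units per minute are assumptions invented for this sketch, not a model of any real controller.

```python
class Reservoir:
    """Toy reservoir that records how much liquid spilled over."""

    def __init__(self, capacity, level=0):
        self.capacity = capacity
        self.level = level
        self.overflowed = 0

    def is_full(self):
        # Plays the role of the 'full' sensor.
        return self.level >= self.capacity

    def add(self, amount):
        self.level += amount
        if self.level > self.capacity:
            self.overflowed += self.level - self.capacity
            self.level = self.capacity


def fill_for_fixed_time(reservoir, minutes=10, rate=10):
    """'Send enough liquid': open the circuit for a fixed duration."""
    for _ in range(minutes):
        reservoir.add(rate)


def fill_until_full(reservoir, rate=10):
    """'Send liquid until the sensor says full': safe to rerun."""
    while not reservoir.is_full():
        reservoir.add(rate)


# Both reservoirs start half full, as after a crash midway through a fill.
timed = Reservoir(capacity=100, level=50)
fill_for_fixed_time(timed)

sensed = Reservoir(capacity=100, level=50)
fill_until_full(sensed)
fill_until_full(sensed)  # second run is a no-op
```

The timed version spills everything past the halfway point of the rerun; the sensor-driven version ends at the same level no matter how many times it runs.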

And is another way to say this "side-effect free"?
The Linux shell command 'touch foo' is idempotent, but not side-effect free.

Reading a variable shared between threads without a lock is side-effect free but not idempotent.

Interestingly, `touch` is not idempotent because it modifies timestamps. Not being pedantic, just an interesting consideration. `mkdir -p` is idempotent, I believe.
You're absolutely correct, it's not idempotent for the crucial reason you mention. It's crucial because updating the timestamps is the main purpose of 'touch' in the first place!
Reading a variable isn't side-effect free either :P
In the sense that you're moving data into a register and updating the program counter?
And potentially changing the contents of processor caches.
Opening Schrodinger's box isn't side-effect free. :)