Say you have a batch job that performs tasks A–E, and each of those tasks might process thousands of records. At some point, something will go wrong that causes the process to crash, hang, or error over many records. If the code is not idempotent, you need to investigate exactly where things started going wrong and figure out how to resume the process at that point. You don't want to reprocess records and e.g. send out duplicate emails or double-increment some number you're tracking. If the code is idempotent, conversely, you can just start the whole process over again without having to worry about any of that.
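To make the "double-increment" problem concrete, here is a minimal Python sketch of one way to sidestep it: store each record's contribution under its id instead of adding to a running total, so a rerun overwrites rather than accumulates. The record shape, the `apply_batch` name, and the in-memory dict are all assumptions for illustration; a real job would keep this in a durable store.

```python
def apply_batch(records, contributions):
    """Store each record's amount under its id and return the derived total.

    Re-running overwrites existing entries instead of adding to them,
    so processing the same record twice leaves the total unchanged.
    """
    for record in records:
        contributions[record["id"]] = record["amount"]
    return sum(contributions.values())


# Hypothetical records; the dict stands in for a durable key-value store.
records = [{"id": "r1", "amount": 5}, {"id": "r2", "amount": 7}]
contributions = {}

first_total = apply_batch(records, contributions)
second_total = apply_batch(records, contributions)  # simulated rerun after a crash
```

Both runs give the same total, which is what lets you restart the whole job without first working out where it stopped.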
Similarly, many systems involve consuming from some message queue. It's basically impossible to guarantee exactly-once delivery in most systems. You have to risk either missing a message or having it delivered multiple times. If you're running idempotent code, you can always err on the side of redundant delivery without any ill effects.
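One common way to tolerate redundant delivery is to remember which message ids have already been handled and skip repeats. The sketch below assumes every message carries a unique `id` field, which not every queue provides; `DedupingConsumer` and the message format are made up for illustration, and the in-memory set stands in for something durable.

```python
class DedupingConsumer:
    """Wraps a handler so that each message id is processed at most once."""

    def __init__(self, handler):
        self.handler = handler
        self.seen_ids = set()  # assumption: a real system would persist this

    def on_message(self, message):
        """Handle the message unless its id was seen; return True if handled."""
        if message["id"] in self.seen_ids:
            return False  # duplicate delivery, safely ignored
        self.handler(message["body"])
        self.seen_ids.add(message["id"])
        return True


# Simulate at-least-once delivery: message m1 arrives twice.
shipped = []
consumer = DedupingConsumer(shipped.append)
deliveries = [
    {"id": "m1", "body": "ship order 42"},
    {"id": "m1", "body": "ship order 42"},
    {"id": "m2", "body": "ship order 43"},
]
results = [consumer.on_message(m) for m in deliveries]
```

Note that a crash between calling the handler and recording the id would still allow one repeat, so this narrows the problem rather than removing it unless the two steps are made atomic.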
Last week I ordered a screen protector for my phone. I got two boxes in the mail. I thought I had ordered twice by mistake, but both had the same order number on the packing slip.
My immediate thought was that some order processing step somewhere is not idempotent.
Idempotency means you can run something several times and end up with the same result as if you had run it once.
Let's say you're controlling a factory and you have a function that fills a reservoir. A naive way would be to define the semantics of the operation as "send enough liquid" and open the circuit for 10 minutes when activated. Since it takes 10 minutes to fill an empty reservoir, all is good.
But what if your program crashes and you have to rerun it? If the reservoir was already partly filled, you will overflow it.
If the semantics of the operation is "send liquid until 'full' sensor indicates you're done", that's liberating. You no longer have to worry about overflowing the reservoir.
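The difference between the two semantics can be shown with a toy simulation. Everything here is hypothetical: the `Reservoir` class, its unit-per-tick pump, and the capacity of 10 are stand-ins chosen only to mirror the example, not a model of any real control system.

```python
class Reservoir:
    """Toy reservoir: pump() adds one unit; going past capacity overflows."""

    def __init__(self, capacity, level=0):
        self.capacity = capacity
        self.level = level
        self.overflowed = False

    def is_full(self):
        # Plays the role of the "full" sensor.
        return self.level >= self.capacity

    def pump(self):
        self.level += 1
        if self.level > self.capacity:
            self.overflowed = True


def fill_for_duration(reservoir, ticks):
    """Naive version: pump for a fixed time, regardless of current level."""
    for _ in range(ticks):
        reservoir.pump()


def fill_until_full(reservoir):
    """Idempotent version: pump only while the sensor says "not full"."""
    while not reservoir.is_full():
        reservoir.pump()


# Rerun after a crash that left the reservoir half full.
naive = Reservoir(capacity=10, level=5)
fill_for_duration(naive, ticks=10)

safe = Reservoir(capacity=10, level=5)
fill_until_full(safe)
fill_until_full(safe)  # running it again changes nothing
```

The fixed-duration version overflows the half-full reservoir, while the sensor-driven version stops at capacity no matter how many times it is called.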
Interestingly, `touch` is not idempotent because it modifies timestamps. Not being pedantic, just an interesting consideration. `mkdir -p` is idempotent, I believe.
You're absolutely correct, it's not idempotent for the crucial reason you mention. It's crucial because updating the timestamps is the main purpose of 'touch' in the first place!
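For anyone who wants to poke at this, here is a small sketch using what I'd treat as the rough Python counterparts: `os.makedirs(..., exist_ok=True)` for `mkdir -p` and `Path.touch()` for `touch`. The directory layout and the artificially old timestamp are assumptions made just to keep the demonstration deterministic.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "a", "b")

    # Like `mkdir -p`: the second call finds the directory and does nothing.
    os.makedirs(target, exist_ok=True)
    os.makedirs(target, exist_ok=True)
    dir_exists = os.path.isdir(target)

    marker = Path(target) / "marker"
    marker.touch()  # creates the file

    # Pretend the file is old so the effect of the next touch is visible.
    os.utime(marker, (1_000_000_000, 1_000_000_000))
    old_mtime = marker.stat().st_mtime

    marker.touch()  # file already exists, but its timestamp moves to "now"
    new_mtime = marker.stat().st_mtime
```

The directory call leaves the same state each time, whereas each touch leaves a different modification time behind.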