by blundergoat 115 days ago
We treat webhooks as at-least-once delivery over an unreliable transport and design for duplicates and out-of-order events.

A few rules that have saved us:

- Persist before responding. Never process inline. Write payload to DB, return 200 fast.

- Idempotency key required. Use either the provider's event ID or a hash of the payload.

- Async worker processes from queue. Exponential backoff + max attempts.

- Dead letter queue + dashboard. Humans need visibility.

- Alert on backlog growth, not single failures. One failure is noise. A growing retry queue is signal.

- Don't rely on provider retries alone. That has bitten us more than once.
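
The rules above could be sketched roughly like this. It is a minimal illustration, not anyone's production setup: it assumes SQLite as both the store and the queue, and every name in it (`open_store`, `receive`, `work_once`, `backlog_growing`, `MAX_ATTEMPTS`, `BASE_DELAY_S`) is hypothetical. The `dead` status stands in for a dead letter queue.

```python
import hashlib
import sqlite3

MAX_ATTEMPTS = 5      # assumed cap before an event is dead-lettered
BASE_DELAY_S = 2.0    # assumed base delay for exponential backoff


def open_store(path=":memory:"):
    """Create the events table; the idempotency key is the primary key."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS events (
               idem_key TEXT PRIMARY KEY,
               payload  TEXT NOT NULL,
               status   TEXT NOT NULL DEFAULT 'pending',
               attempts INTEGER NOT NULL DEFAULT 0,
               next_try REAL NOT NULL DEFAULT 0
           )"""
    )
    return db


def receive(db, payload: bytes, event_id=None) -> int:
    """Persist only, then acknowledge. No processing happens here."""
    # Prefer the provider's event ID; fall back to a hash of the payload.
    key = event_id or hashlib.sha256(payload).hexdigest()
    with db:  # commits on success, rolls back on error
        # A duplicate delivery collides with the primary key and is ignored.
        db.execute(
            "INSERT OR IGNORE INTO events (idem_key, payload) VALUES (?, ?)",
            (key, payload.decode("utf-8")),
        )
    return 200


def backoff_delay(attempts: int) -> float:
    """Delay before the next try: 2s, 4s, 8s, ... for attempts 1, 2, 3, ..."""
    return BASE_DELAY_S * 2 ** (attempts - 1)


def work_once(db, handler, now: float) -> bool:
    """Process one due event. Returns False when nothing is due."""
    row = db.execute(
        "SELECT idem_key, payload, attempts FROM events "
        "WHERE status = 'pending' AND next_try <= ? "
        "ORDER BY next_try LIMIT 1",
        (now,),
    ).fetchone()
    if row is None:
        return False
    key, payload, attempts = row
    attempts += 1
    try:
        handler(payload)
        status, next_try = "done", now
    except Exception:
        # Out of attempts: park it as 'dead' for a human to look at.
        status = "dead" if attempts >= MAX_ATTEMPTS else "pending"
        next_try = now + backoff_delay(attempts)
    with db:
        db.execute(
            "UPDATE events SET status = ?, attempts = ?, next_try = ? "
            "WHERE idem_key = ?",
            (status, attempts, next_try, key),
        )
    return True


def backlog(db) -> int:
    """Number of events still waiting (including ones waiting on a retry)."""
    return db.execute(
        "SELECT COUNT(*) FROM events WHERE status = 'pending'"
    ).fetchone()[0]


def backlog_growing(samples, window=3) -> bool:
    """True only if the backlog rose on each of the last `window` samples."""
    recent = samples[-(window + 1):]
    return len(recent) == window + 1 and all(
        later > earlier for earlier, later in zip(recent, recent[1:])
    )
```

In a real service, `receive` would sit behind the HTTP endpoint, `work_once` would run in a loop in a separate worker with `now=time.time()`, and `backlog_growing` would be fed periodic `backlog(db)` samples so that a single failed event never pages anyone.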

1 comments

Thank you so much for the tips! I was feeling nervous about relying on provider retries as well. I especially like the idea of alerting on backlog growth. There's nothing I hate more than a bunch of emails and notifications!
This was a nice goat exchange