by peterangular 2017 days ago
Yep - I have the HTTP error code detection dialed up to an extreme, because it's dumb to keep running a broken scrape anyway.

Frankly, it's any errors at all. If I see more than, say, 5-10 jobs fail within a 2-3 minute window, things are designed to wait X time and try again. If they're still hitting errors after that, they stop and ping me to come in and investigate.
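Roughly what that looks like, as a minimal sketch and not my actual code. The class name, the method names, and the default numbers (5 failures, 180 seconds, a 600 second wait standing in for "X") are all made up for illustration. It also assumes only one retry is allowed before stopping, and that the breaker never resets on its own.

```python
from collections import deque


class FailureBreaker:
    """Hypothetical helper: counts recent job failures and says run, wait, or stop."""

    def __init__(self, max_failures=5, window_seconds=180, cooldown_seconds=600):
        self.max_failures = max_failures          # tolerated failures per window
        self.window_seconds = window_seconds      # sliding window length
        self.cooldown_seconds = cooldown_seconds  # the "wait X time" pause
        self.failures = deque()                   # timestamps of recent failures
        self.paused_until = None
        self.already_retried = False
        self.stopped = False

    def record_failure(self, now):
        """Record a failed job at time `now` (seconds); return the next action."""
        self.failures.append(now)
        # Forget failures that have fallen out of the sliding window.
        while self.failures and now - self.failures[0] > self.window_seconds:
            self.failures.popleft()
        if len(self.failures) <= self.max_failures:
            return "run"
        # Too many failures in the window: start counting afresh.
        self.failures.clear()
        if self.already_retried:
            # Still failing after the one allowed retry: stop for good.
            self.stopped = True
            return "stop"
        self.already_retried = True
        self.paused_until = now + self.cooldown_seconds
        return "wait"

    def may_run(self, now):
        """True if jobs are allowed to be scheduled at time `now`."""
        if self.stopped:
            return False
        return self.paused_until is None or now >= self.paused_until
```

The scheduler would check `may_run(time.monotonic())` before each job and call `record_failure` on every error. On `"stop"` it fires whatever alert you use (email, pager, chat webhook) and leaves everything halted until a human looks at it.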

Faulty retry logic is just as dangerous as the forked/distributed run-off situation.