by peterangular
2017 days ago
Yep, I have the HTTP error code detection dialed up to an extreme, because it's dumb to keep running a broken scrape anyway. Frankly, that goes for any errors: if more than, say, 5-10 jobs fail within a 2-3 minute period, things are designed to wait X time and try again, then stop and ping me to come in and investigate if they're still hitting errors. Faulty retry logic is just as dangerous as the forked/distributed run-off situation.
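For anyone wanting to picture that, here is a rough Python sketch of the "burst of failures, wait, retry once, then stop and alert" idea. It is not my actual code: the class name `FailureBreaker`, the default thresholds, and the `alert` callback are all made up for illustration, and it assumes "still encountering errors" means the first job after the wait fails again.

```python
import time
from collections import deque


class FailureBreaker:
    """Hypothetical helper: pause on a burst of failures, halt if the retry fails too."""

    def __init__(self, max_failures=5, window_s=180.0, cooldown_s=600.0,
                 clock=time.monotonic, sleep=time.sleep, alert=print):
        self.max_failures = max_failures  # failures tolerated inside the window
        self.window_s = window_s          # sliding window length, in seconds
        self.cooldown_s = cooldown_s      # the "wait X time" before trying again
        self.clock, self.sleep, self.alert = clock, sleep, alert
        self.failures = deque()           # timestamps of recent failures
        self.probing = False              # True right after a cooldown
        self.halted = False               # True once we give up and alert

    def record_success(self):
        # A healthy job after the cooldown puts us back in normal operation.
        self.probing = False

    def record_failure(self):
        if self.halted:
            return
        if self.probing:
            # Assumption: one failure right after the cooldown means "still broken".
            self.halted = True
            self.alert("scrape halted: still failing after cooldown")
            return
        now = self.clock()
        self.failures.append(now)
        # Forget failures that have aged out of the sliding window.
        while self.failures and now - self.failures[0] > self.window_s:
            self.failures.popleft()
        if len(self.failures) > self.max_failures:
            self.failures.clear()
            self.probing = True
            self.sleep(self.cooldown_s)
```

The worker loop would call `record_failure()` on any HTTP error or exception, `record_success()` otherwise, and check `halted` before pulling the next job. The point of the single retry is that the breaker itself cannot turn into the faulty retry logic it is meant to guard against.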