by ZephyrBlu
1161 days ago
Let me get this straight, you're complaining about eight 9s of reliability?

50,000,000 * 7 = 350,000,000
2 / 350,000,000 = 0.000000005714286
1 - (2 / 350,000,000) = 0.999999994285714 = 99.999999%

> It's not a ton, but it's not zero. And when it comes to data durability the difference between zero and not zero is usually all that matters.

If your system isn't resilient to 2 in 350,000,000 jobs failing I think there is something wrong with your system.
We're not talking about reliability, we're talking about durability. For reference, S3 has eleven 9s of durability.
Every major queuing system solves this problem. RabbitMQ keeps unacknowledged messages pinned to a TCP connection, so when that connection drops before they are acknowledged they get picked up by another worker. SQS uses visibility timeouts: if a message hasn't been successfully processed within a time frame, it's made available to other workers again. Sidekiq free edition chooses not to solve it. That's a fine stance for a free product, but it's one I wish was made clearer.
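
To make the visibility timeout idea concrete, here is a rough in-memory sketch in Python. It is not how SQS is implemented; the class name `VisibilityTimeoutQueue`, its methods, and the 30 second default are all made up for illustration. The point is only that a received job is hidden rather than deleted, so a worker that dies before acking doesn't take the job with it.

```python
import time
from collections import deque


class VisibilityTimeoutQueue:
    """Toy queue (hypothetical, not a real library API).

    A received job is hidden from other workers, not deleted.
    It is only removed for good when a worker acks it.
    """

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self._timeout = visibility_timeout
        self._clock = clock          # injectable so the timeout can be tested
        self._ready = deque()        # (job_id, payload) waiting for a worker
        self._in_flight = {}         # job_id -> (payload, redelivery deadline)
        self._next_id = 0

    def send(self, payload):
        job_id = self._next_id
        self._next_id += 1
        self._ready.append((job_id, payload))
        return job_id

    def receive(self):
        """Hand out the next job and hide it until its deadline passes."""
        self._requeue_expired()
        if not self._ready:
            return None
        job_id, payload = self._ready.popleft()
        self._in_flight[job_id] = (payload, self._clock() + self._timeout)
        return job_id, payload

    def ack(self, job_id):
        """Delete an in-flight job. Returns False if it is not in flight."""
        return self._in_flight.pop(job_id, None) is not None

    def _requeue_expired(self):
        # Jobs whose worker never acked in time become visible again.
        now = self._clock()
        expired = [j for j, (_, deadline) in self._in_flight.items() if deadline <= now]
        for job_id in expired:
            payload, _ = self._in_flight.pop(job_id)
            self._ready.append((job_id, payload))
```

If a worker calls `receive()` and then crashes, the job just sits in `_in_flight` until the deadline passes, and the next `receive()` hands it to someone else. The price is at-least-once delivery: a slow worker that outlives the timeout can end up running the same job as a second worker, so jobs need to be idempotent.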