by Bogdanp 3109 days ago
Unless I'm misunderstanding, it sounds like your use case fits RabbitMQ perfectly. Pseudocode:

    while True:
      message = queue.consume()
      process(message)
      message.ack()

RabbitMQ will automatically put the message back on the queue when the consumer that pulled it disconnects w/o acknowledging it first. Alternatively, you could explicitly reject messages:

    while True:
      message = queue.consume()
      try:
        process(message)
      except Exception:
        message.reject()
        raise
      else:
        message.ack()

If you're using Python, you might want to check out Dramatiq[1].

[1]: https://dramatiq.io
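
For anyone who wants something closer to real code, here is a minimal sketch of the ack/reject pattern above as a pika-style consumer callback. The `process` function, the `tasks` queue name, and the `localhost` broker are placeholders I made up for illustration, and I'm assuming the pika 1.x client; treat it as a sketch rather than a recommended setup.

```python
import logging

log = logging.getLogger("worker")


def process(body: bytes) -> None:
    """Hypothetical task body; replace with the real work."""
    if body == b"bad":
        raise ValueError("cannot process message")


def on_message(channel, method, properties, body):
    """pika-style callback: ack on success, reject with requeue on failure."""
    try:
        process(body)
    except Exception:
        log.exception("processing failed, requeueing")
        # requeue=True asks the broker to put the message back on the queue.
        channel.basic_reject(delivery_tag=method.delivery_tag, requeue=True)
    else:
        channel.basic_ack(delivery_tag=method.delivery_tag)


def main():
    # Imported here so the callback above can be exercised without pika installed.
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="tasks", durable=True)  # assumed queue name
    channel.basic_qos(prefetch_count=1)  # at most one unacked message per consumer
    channel.basic_consume(queue="tasks", on_message_callback=on_message)
    channel.start_consuming()
```

Call `main()` to start consuming. Unlike the pseudocode, this version does not re-raise after rejecting, so a message that always fails would be redelivered forever; a dead-letter queue or a retry limit is the usual way around that.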

Not exactly. Tasks can last for weeks. It can run fine for several days and then die, and it needs to be requeued, until it is explicitly finished. In fact, we do use RabbitMQ to emit "status updates".

With RabbitMQ, I'd need to ack right away, otherwise it would re-send the message again after a while.

Very curious what type of work you're doing where atomic tasks can run for weeks at a time.

We have something like this, not weeks but days. Linear programs, integer math, using IBM CPLEX to schedule "people" to do "things" at the ideal time.

In our case, data migration. It's resumable and sometimes dies for various reasons (or just because infra rescaling is happening), but it takes very long to run.

> With RabbitMQ, I'd need to ack right away, otherwise it would re-send the message again after a while.

This is not exactly the case. RMQ will only re-enqueue the message when the consumer disconnects. If you're able to keep the consumer connection alive (this is easy to do with the heartbeat mechanism) for the processing duration, even if it takes a long time, RMQ should handle it fine. That said, if the connection between your consumers and RMQ is flaky, you'll have to make your tasks re-entrant.
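
A rough sketch of what keeping the connection alive during long processing can look like with pika: the work runs in a separate thread so the connection's own thread stays free to service heartbeats, and the ack is handed back to that thread when the work finishes. The `run_long_task` helper, the `work` callable, the `tasks` queue, and the 60-second heartbeat are my own assumptions for illustration. Also note that newer RabbitMQ releases have a delivery acknowledgement timeout (`consumer_timeout`, 30 minutes by default as far as I know), which would need raising for tasks that stay unacked for days.

```python
import functools
import threading


def run_long_task(connection, channel, delivery_tag, body, work):
    """Run work(body) off the connection thread, then ack from the connection thread."""

    def target():
        # If work() raises, no ack is sent and the message stays unacked
        # until the connection closes, at which point the broker requeues it.
        work(body)
        # pika channels are not thread-safe, so schedule the ack on the
        # connection's own thread instead of calling basic_ack here.
        connection.add_callback_threadsafe(
            functools.partial(channel.basic_ack, delivery_tag=delivery_tag)
        )

    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    return thread


def main(work):
    # Imported here so run_long_task can be exercised without pika installed.
    import pika

    params = pika.ConnectionParameters("localhost", heartbeat=60)  # assumed values
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.basic_qos(prefetch_count=1)

    def on_message(ch, method, properties, body):
        run_long_task(connection, ch, method.delivery_tag, body, work)

    channel.basic_consume(queue="tasks", on_message_callback=on_message)
    # The consume loop keeps processing I/O, including heartbeats, while work runs.
    channel.start_consuming()
```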

Good point. I'd rather have something explicit going on. In other places where we do use RabbitMQ (for short-lived, non-critical tasks), the listening processes log reconnects every once in a while, even with heartbeat.

I've seen folk talk about progress messages for long-running tasks and jobs. If you have checkpointing then they play well together.
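
To make the checkpointing idea concrete, here is a small stdlib-only sketch of a resumable task that saves its position after every item and emits a progress message, so a redelivered message picks up where the last run died. The file-based checkpoint, the `next_item` field, and the `report` hook are assumptions of mine; in a real setup the checkpoint would more likely live in a database and `report` would publish a status update.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next item to process (0 if no checkpoint exists)."""
    try:
        with open(path) as f:
            return json.load(f)["next_item"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_item):
    # Write to a temp file and rename it into place, so a crash mid-write
    # never leaves a half-written checkpoint behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_item": next_item}, f)
    os.replace(tmp, path)


def run_task(items, path, handle, report=print):
    """Process items in order, resuming from the last saved checkpoint."""
    start = load_checkpoint(path)
    for i in range(start, len(items)):
        handle(items[i])
        save_checkpoint(path, i + 1)
        report(f"progress {i + 1}/{len(items)}")
```

If the process dies between `handle` and `save_checkpoint`, that one item is handled again on the next run, so `handle` should be idempotent.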