by fiznool
1798 days ago
One of the nice things about a pull-based model (polling) is that you are in control of throughput. Need to process events more slowly? Increase the polling interval. It’s impossible to achieve the same thing with a push-based model (webhooks): you are at the mercy of the producer’s rate of webhook delivery. I had this issue a few years ago with a queue-as-a-service that sent jobs via webhook - the queue would intermittently drain extremely quickly, sending thousands of requests per second, which totally overwhelmed our poor single Heroku dyno. One big issue with the pull-based model, though, is concurrency. If you have multiple workers polling an API endpoint for new data, you need to synchronise the ‘last seen’ ID or timestamp across all workers. Otherwise, worker A and worker B might pull the same data and you could end up with duplicates. There’s no silver bullet here; either model requires work to harden against edge cases.
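To make the 'last seen' point concrete, here is a rough Python sketch of one way to do it, not the only way. Everything in it is made up for illustration: `fetch_after` stands in for the real API endpoint, `SharedCursor` is a hypothetical in-process cursor guarded by a lock (with workers on separate machines you would need a database row or similar shared store instead), and `poll_interval` is the throttle knob mentioned above.

```python
import threading
import time

# Stand-in for the remote API: 100 events with increasing IDs (assumption).
EVENTS = [{"id": i} for i in range(1, 101)]


def fetch_after(last_id, limit):
    """Hypothetical endpoint: return up to `limit` events with id > last_id."""
    return [e for e in EVENTS if e["id"] > last_id][:limit]


class SharedCursor:
    """Holds the 'last seen' ID shared by all workers in this process."""

    def __init__(self, start=0):
        self._last_seen = start
        self._lock = threading.Lock()

    def claim(self, fetch, batch_size):
        # Fetch and advance the cursor under one lock, so two workers
        # can never be handed the same batch.
        with self._lock:
            batch = fetch(self._last_seen, batch_size)
            if batch:
                self._last_seen = batch[-1]["id"]
            return batch


def worker(cursor, processed, batch_size, poll_interval):
    while True:
        batch = cursor.claim(fetch_after, batch_size)
        if not batch:
            return  # a real poller would sleep and poll again
        ids = [e["id"] for e in batch]
        processed.extend(ids)
        time.sleep(poll_interval)  # raise this to slow down consumption


def run_workers(n_workers=4, batch_size=7, poll_interval=0.001):
    cursor = SharedCursor()
    processed = []
    threads = [
        threading.Thread(
            target=worker,
            args=(cursor, processed, batch_size, poll_interval),
        )
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


processed = run_workers()
```

The trade-off in this sketch is that the fetch happens while the lock is held, so polling is serialised across workers and only the processing runs in parallel. It also advances the cursor before the batch is processed, so a worker that crashes mid-batch loses those events unless you add some form of acknowledgement.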