by j_baker 5387 days ago
This solution isn't necessarily terrible.

The only caveats are performance (with a traditional server I wouldn't worry until you need to process hundreds of items per second, but on EC2 nodes that threshold is closer to dozens per second) and the need to regularly archive the "done" directory (cron solves this nicely).

...but why would you worry about these problems when other solutions like kestrel, beanstalk, and redis (my personal favorite) are equally easy to set up and understand?

And for that matter, how do you give multiple machines access to this workqueue?
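
On the archiving caveat, here is a rough sketch of the kind of script a cron job might run. It assumes completed items sit as plain files in a "done" directory; the function name `archive_done`, the archive naming scheme, and the `max_age_seconds` parameter are made up for illustration and are not from the original post.

```python
import tarfile
import time
from pathlib import Path
from typing import Optional


def archive_done(done_dir: Path, archive_dir: Path,
                 max_age_seconds: float,
                 now: Optional[float] = None) -> Optional[Path]:
    """Bundle files older than max_age_seconds into a .tar.gz, then delete them.

    Returns the path of the archive, or None if nothing was old enough.
    """
    now = time.time() if now is None else now
    old = [p for p in sorted(done_dir.iterdir())
           if p.is_file() and now - p.stat().st_mtime > max_age_seconds]
    if not old:
        return None

    archive_dir.mkdir(parents=True, exist_ok=True)
    archive_path = archive_dir / f"done-{int(now)}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in old:
            tar.add(p, arcname=p.name)

    # Only remove the originals after the archive has been written and closed.
    for p in old:
        p.unlink()
    return archive_path
```

Run daily from cron with something like `archive_done(Path("queue/done"), Path("queue/archive"), 7 * 86400)`, where the paths and the one-week cutoff are placeholders.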


As you point out, both of those are excellent points at which you should consider a "real" queuing system :)

Yeah, but why not just skip the intermediary step and use a "real" queueing system to begin with? It doesn't sound to me like it's any more effort in the short term or in the long term, and it's one less thing you have to worry about as you scale.

Gonna play devil's advocate here:

I think making files in a folder represents the least amount of effort for making a queue. So using one of the systems you described is necessarily more work.

I think the commenter outlined the reasons: any process can access the data with simple unix commands, and everyone understands files.

Plus, files could be more efficient. What if the work units you are processing are themselves files? If the files are the work and the folder is the queue, you don't need any extra abstractions to access the data.
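
To make the "files in a folder" idea concrete, here is a minimal sketch of such a queue, not anyone's actual implementation. It assumes a single machine with all directories on one local filesystem, where a rename is atomic, so two workers can't both claim the same item. The directory names `pending` and `working` and the function names are invented for illustration; only the "done" directory comes from the discussion above.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def enqueue(queue_dir: Path, name: str, payload: bytes) -> Path:
    """Write the payload to a temp file, then rename it into pending/.

    Renaming a fully written file means workers never see a half-written item.
    """
    pending = queue_dir / "pending"
    pending.mkdir(parents=True, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=queue_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    target = pending / name
    os.rename(tmp_path, target)
    return target


def claim(queue_dir: Path) -> Optional[Path]:
    """Claim one pending item by renaming it into working/.

    Items are tried in filename order. Returns None when nothing is pending.
    """
    pending = queue_dir / "pending"
    working = queue_dir / "working"
    pending.mkdir(parents=True, exist_ok=True)
    working.mkdir(parents=True, exist_ok=True)
    for item in sorted(pending.iterdir()):
        target = working / item.name
        try:
            os.rename(item, target)
        except FileNotFoundError:
            continue  # another worker renamed it first; try the next one
        return target
    return None


def finish(queue_dir: Path, item: Path) -> Path:
    """Move a processed item from working/ into done/."""
    done = queue_dir / "done"
    done.mkdir(parents=True, exist_ok=True)
    target = done / item.name
    os.rename(item, target)
    return target
```

A worker loop would call `claim`, process the returned file, then call `finish`. Note that this sketch does nothing about items stranded in `working` after a crash, and it does not answer the multiple-machines question raised earlier.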

Because then you have to admin the real queueing system. If it's something simple, sometimes the one-time cost of re-solving the problem is less than the ongoing cost of dealing with that damn queueing system every time someone wants to set the app up on a new host, or a new dev wants to work with it, or it crashes, etc.

There's a fine line between when each approach is appropriate.

Beyond the novelty of doing it as a learning exercise the first time 'round, I agree that your approach is better where there's any expectation of scale.