by fdr 4969 days ago
I think the article is good; however, I feel compelled to weigh in from the countervailing general direction:

I am a message queue skeptic. Not that they should never be used, but rather I have a general feeling that complex, dedicated message queue software is often applied to engineering problems two or three orders of magnitude too small for it to deliver value. And, for most projects, queue replacement is not too difficult, especially if one's use of a database-backed queue is relatively naive.

As such, I will -- with an open mind -- suggest that using general purpose database management software as a queue is not an outright antipattern, and that the times to use it this way are probably more common than the opposite, where a dedicated queue delivers clear value.

Here are my main reasons for thinking that, which almost entirely have something to do with being able to address one's queue and one's other data in the same transaction:

* Performance: a pedantically unsound but basically reasonable rule of thumb (as experienced by me) would suggest that one needs to be processing somewhere between hundreds and thousands of messages per second, with at least fifty (and maybe up to a hundred or two) parallel executors processing or emitting messages, before there are performance issues where the lower constants of a dedicated queuing package become attractive. Below this kind of throughput, a dedicated queue tends to bring more pain than gain.

* Correctness: Most database + queue integrations do a lousy job of what is effectively two-phase commit between different data storage instances (so that would include two+ RDBMSes, even if they are of the same kind, e.g. 2xPostgreSQL), and frequently the queue has to be counted among these (exception: when the queue contents can be lost/can be rebuilt/is idempotent at all times). Systems do a lousy job of making this work because it's pretty finicky to do a good job in many situations, i.e. expensive and time consuming.

* Constants, when dealing with other systems: When one does do a good job and has interesting requirements in the 'correctness' case, it often means doing forms of two-phase commit, whether explicitly supported by the system (e.g. PREPARE TRANSACTION) or a spiritual equivalent via carefully thought out state machines. In principle these could be relatively cheap, but typically to avoid complexity more expensive approaches are employed, such as a couple extra UPDATE requests to poke at some home-grown state machine.

Also, my experience indicates that as systems evolve, there will be inevitable bugs in these state machines that, by nature, span systems. Be vigilant and make sure you get more value than pain, and try to avoid having too many of them.

* HA is still hard: clustering is generally in principle possible, but make sure you read the fine print. For example, many people use Redis as a queue, but it is not really unlike any other monolithic database most generally -- its main draw as a vanilla queue is good execution-time constants. The same could be said of Apache ActiveMQ in its least byzantine configuration. One might think that one would get a lot of leverage 'for free' given the simpler semantics of queues vs the diversity of access methods in most general purpose databases, but so far I have not seen that to be the case, for the very good reason that a lot of people expect a lot of reliability out of their queues (no less than the transactional nature of some databases), and doing that is either most natural in a monolithic system or slow, or complicated, or both in a multi-master distributed system.
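To make the "same transaction" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` and `email_queue` tables, the function names, and the `fail` flag are all made up for illustration; a real deployment would more likely sit on PostgreSQL or similar and need row locking for concurrent workers. The only thing it tries to show is that the application row and the queued job commit or roll back together, with no two-phase commit between systems.

```python
import sqlite3


def open_db():
    # isolation_level=None disables sqlite3's implicit transactions,
    # so the BEGIN/COMMIT/ROLLBACK statements below are fully explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE email_queue ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " state TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def signup(conn, email, fail=False):
    """Insert a user and enqueue a welcome-email job in ONE transaction."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.execute(
            "INSERT INTO email_queue (user_id) VALUES (?)", (cur.lastrowid,)
        )
        if fail:
            # Hypothetical hook to simulate a crash before commit.
            raise RuntimeError("simulated failure before commit")
        conn.execute("COMMIT")
    except Exception:
        # Neither the user row nor the queued job survives.
        conn.execute("ROLLBACK")
        raise


def claim_and_finish(conn):
    """Take the oldest pending job, mark it done, and return its user_id."""
    conn.execute("BEGIN IMMEDIATE")
    row = conn.execute(
        "SELECT id, user_id FROM email_queue"
        " WHERE state = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        conn.execute("COMMIT")
        return None
    conn.execute("UPDATE email_queue SET state = 'done' WHERE id = ?", (row[0],))
    conn.execute("COMMIT")
    return row[1]
```

Calling `signup(conn, "b@example.com", fail=True)` leaves both tables untouched, which is exactly the guarantee that takes a state machine or PREPARE TRANSACTION to get once the queue lives in a separate system.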

All in all, if you think you need dedicated queuing software to send a few dozen emails a second (that's a lot of email for most people!), think twice: it might still be a good idea, but brace yourself for these pitfalls or convince yourself that they probably mostly don't apply to you.

3 comments

Initially this might seem different from the topic at-hand, but bear with me please:

Several years ago (probably first or second year of high school) I wrote and distributed amongst my friends a small desktop game. The program e-mailed me their new high-scores and internal game diagnostics daily. I found that this was too often so I rewrote the game to send these data daily. This way of retrieving data sucked. Often, I would get empty or unimportant messages.

Eventually I had the realisation that time-oriented polling was the wrong philosophy. Instead, there should be a "this event has happened now" (e.g. 500kb of diag. generated) algorithm that calls the polling routine/invokes the 'stuff to be done traverser'. If I understand the problem (the mass e-mail example) correctly, you could keep a simple counter as part of handling the HTTP request. When this counter hits a threshold you choose (e.g. there are 500 jobs to do), call the traverse/processing code:

if jobs_to_do > thresh then invoke some async processing
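Spelled out a bit more, the one-liner above might look like the sketch below. The `ThresholdTrigger` class and its method names are invented for illustration, and the processing callback is called synchronously here; in a real request handler you would presumably hand the batch to a background worker instead. It fires when the count reaches the threshold (`>=`), matching the "hits a threshold" wording.

```python
class ThresholdTrigger:
    """Collect jobs and run `process` once `threshold` of them are pending."""

    def __init__(self, threshold, process):
        self.threshold = threshold
        self.process = process  # callable that receives a list of jobs
        self.pending = []

    def add(self, job):
        """Record a job; return True if this call triggered processing."""
        self.pending.append(job)
        if len(self.pending) >= self.threshold:
            # Swap the list out first so the callback sees a stable batch.
            batch, self.pending = self.pending, []
            self.process(batch)
            return True
        return False

    def flush(self):
        """Process whatever is left, e.g. at shutdown."""
        if self.pending:
            batch, self.pending = self.pending, []
            self.process(batch)
```

One caveat with a pure threshold: a partial batch just sits there until enough jobs arrive, so something still has to call `flush()` eventually.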

The takeaway from this post is that, in my opinion, time-based (cron-job-ish) polling algorithms are inefficient but can be replaced with rudimentary event-driven ones.

And just BTW, 'countervailing' - I'm deifying this word. Holy s*. I'm going to use it all the time.

Thanks for your feedback, I appreciate the thoughtful response. I actually agree that generalized message queues can often be complex and perhaps even unnecessary when dealing with asynchronous processing at a small scale depending on your needs.

I think the important thing is to understand your requirements, the volume of jobs, etc. In my series, I also plan to introduce much simpler, lighter work queues that are a happy medium between a 'heavy duty' generalized message queue and trying to wedge a queue into a database.

But as with everything, people should evaluate the available options for themselves. My goal is just to provide people with a framework for understanding the tradeoffs.

Interestingly, that's why Microsoft created SQL Server Service Broker. They had the infrastructure for reliable message queuing and transactional support, so they created SQL Server Service Broker!

Not sure it ever took off though...

Unfortunately it didn't really take off, which is a shame because it's a good, polished and complete implementation that allows for some advanced scale-out topologies (it can also be used as a foundation for data dependent routing and map/reduce scenarios).

<rant>I guess it's not much used because people like to reinvent the wheel every time (by manually implementing queues using tables with all the traditional concurrency problems) instead of learning something a bit more complex.</rant>

Anyway, it's not going to be thrown away anytime soon as it's used in other parts of the engine (e.g. SMTP mail integration, Server Events and Query Notifications).