Hacker News
by CharlieDigital 1400 days ago
> because of personalization or analytics features

That's not the architectural reason for message queues in my experience.

Primarily, a message queue is used where reacting to some event is expensive enough to become a bottleneck for overall application throughput. Instead of reacting to the event synchronously, you queue it and react to it asynchronously.

Some very common use cases:

1. Your UI. Operating systems use a message queue to capture and forward inputs. Your mouse and keyboard events are all queued in a message queue.

2. Webhooks. Many webhook origins have timeouts on responses and your application must respond within a given window or it will be throttled or downgraded. In this case, the best practice is to queue the event to be processed asynchronously (that queue can be a simple database table with your own logic and wrapper around it or something like AWS SQS, Azure Service Bus, or Google Pub/Sub).

3. Mitigating Throughput Bottlenecks. Most applications have a read/write asymmetry so it makes sense to optimize your architecture to scale for reads. But what if your application occasionally has to handle a burst of writes? Should you size your infrastructure for that case? One approach is to proxy the writes through a queue so you can size the infrastructure for a maximum throughput that is managed by the queue. For example, instead of 1000 concurrent writes per second, a queue can capture the write mutations and trickle out only 100 concurrent writes per second. Instead of sizing your application to scale to handle 1000 writes per second, you only need to size your queue to handle that scale.

4. Resiliency. If a message fails, it can be retried according to whatever heuristics make sense for the domain. Sure, you can use a simple loop to retry, but every message queue provides some mechanism for handling retries, failed message delivery, and so on. If you decide to roll your own and log a failed call into a database to try it again later...well, you've effectively captured a message in a custom queue.
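
To make points 2 through 4 concrete, here is a minimal sketch of the "simple database table with your own logic" idea, assuming SQLite as the store. The table layout and the names `open_queue`, `enqueue`, and `drain` are made up for illustration and are not from any particular library. The request path only inserts a row; a separate worker drains a bounded batch and leaves failed rows pending until they run out of attempts.

```python
import json
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) a SQLite-backed queue table. Hypothetical schema."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " attempts INTEGER NOT NULL DEFAULT 0,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return db


def enqueue(db, event):
    """Cheap synchronous step: persist the event and return right away."""
    db.execute("INSERT INTO queue (payload) VALUES (?)", (json.dumps(event),))
    db.commit()


def drain(db, handler, batch_size=100, max_attempts=3):
    """Process at most batch_size pending events, oldest first.

    A failed event stays 'pending' for a later drain until it has failed
    max_attempts times, at which point it is marked 'dead'.
    Returns the number of events handled successfully.
    """
    rows = db.execute(
        "SELECT id, payload, attempts FROM queue"
        " WHERE status = 'pending' ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    done = 0
    for row_id, payload, attempts in rows:
        try:
            handler(json.loads(payload))
            db.execute("UPDATE queue SET status = 'done' WHERE id = ?", (row_id,))
            done += 1
        except Exception:
            attempts += 1
            status = "dead" if attempts >= max_attempts else "pending"
            db.execute(
                "UPDATE queue SET attempts = ?, status = ? WHERE id = ?",
                (attempts, status, row_id),
            )
    db.commit()
    return done
```

A webhook endpoint would call `enqueue` and respond immediately, while a worker calls `drain` on a timer, so `batch_size` per tick is what caps the write rate the downstream system sees. A real setup would also need locking for multiple workers and some backoff between retries, which is roughly what you get for free from SQS and friends.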

2 comments

User tracking and/or personalization tick 2-3 of your 4 boxes:

UI - Logging mouse movement, keyboard events, and other types of attention proxies.

Webhooks and/or Bottlenecks - calling out to either internal or third party classifiers, or more recently even generative models, for personalization based on user tracking data.

And I don't think this is just my unlucky experience. The fine article includes user tracking as one of only three explicitly enumerated reasons for queuing:

> Why? Because users increasingly expect a real-time experience. In use cases like order flows, webhooks, user tracking, etc. users expect to be able to see the new data in the user interface instantly, instead of having to wait for some background batch processing to periodically reload.

Of the explicitly enumerated motivations in the article:

1. "Order flows" - tracking/modification is often one of the higher latency items in order flows ("you might also like" / "what to order next" features).

2. "Webhooks" - often used for tracking/personalization

3. "User tracking" - ...this one is easy :)

Webhooks go far beyond personalization and tracking; they're a general-purpose integration pattern.

Yes, and I've worked on browser-based games that are built on top of webhooks. But what percentage of people could forgo webhooks entirely if they weren't doing any personalization or tracking? Or could at least get away with webhooks without any "real" queuing infra? I'd wager a large number.

Not really. Any payment system requires webhooks.

Shameless (but relevant!) plug: we built Svix to help people send webhooks from their platform. With Svix you don't have to use message queues for webhooks, at least not from the sender side: https://www.svix.com/

Note: message queues are great, though; they make asynchronous operations and interacting with external services much more resilient.