by pradn 2087 days ago
The ordering keys feature supports a large number of keys. Throughput is limited to 1 MB/sec per key, but that is enough that most applications shouldn't have issues scaling up on a given key.

Imagine you have an order processing system where you have to 1) write to a database 2) write to a metrics log 3) and send an email to the customer. You can publish a message for each step, with the ordering key being the user who initiated the order. This means you are guaranteed to see message 1 before 2, and 2 before 3.
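To make the per-key guarantee concrete, here is a toy in-memory model of it. This is only a sketch: `ToyOrderedTopic`, its methods, and the step names are made up for illustration and are not the Pub/Sub client API. It assumes one FIFO queue per ordering key.

```python
from collections import defaultdict, deque


class ToyOrderedTopic:
    """In-memory stand-in for a topic with ordering keys (NOT the Pub/Sub API)."""

    def __init__(self):
        # One FIFO queue per ordering key; keys are independent of each other.
        self._queues = defaultdict(deque)

    def publish(self, ordering_key, data):
        self._queues[ordering_key].append(data)

    def pull(self, ordering_key):
        """Return the oldest undelivered message for a key, or None if empty."""
        queue = self._queues[ordering_key]
        return queue.popleft() if queue else None


topic = ToyOrderedTopic()
# Three steps for one user's order, published under that user's key.
for step in ("write_db", "write_metrics", "send_email"):
    topic.publish("user-123", step)
# A different user's messages are ordered independently.
topic.publish("user-456", "write_db")

received = [topic.pull("user-123") for _ in range(3)]
```

Messages for `user-123` come back in publish order, while `user-456` has its own queue, which is why throughput can scale across keys.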

You do have to account for possible message re-deliveries. In this example, you can 1) write to the database with an order's unique ID (to prevent duplicate rows) 2) be fine with duplicates for metrics since a bit of duplication is okay (or maybe you have a job later that removes duplicates offline) 3) and be okay with sending emails to customers twice (pretty harmless). You may also keep a side-cache of processed messages to reduce the processing of duplicates, but that's a bit heavy and may not be necessary.
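A rough sketch of what re-delivery-tolerant handling could look like, using plain Python containers in place of a real database, metrics log, and mailer. The message fields (`message_id`, `order_id`, `user`, `step`) and the `handle` function are assumptions made up for this example.

```python
database = {}          # order_id -> row; keyed writes make duplicates harmless
metrics_log = []       # duplicate entries are tolerated here
emails_sent = []       # a repeated email is accepted as harmless
processed_ids = set()  # optional side-cache of message IDs already handled


def handle(message):
    """Process one delivery; safe to call again if the message is redelivered."""
    if message["message_id"] in processed_ids:
        return False  # side-cache hit: skip the duplicate entirely
    order_id = message["order_id"]
    step = message["step"]
    if step == "write_db":
        database[order_id] = {"user": message["user"]}  # upsert, not append
    elif step == "write_metrics":
        metrics_log.append(order_id)
    elif step == "send_email":
        emails_sent.append(message["user"])
    processed_ids.add(message["message_id"])
    return True


msg = {"message_id": "m1", "order_id": "order-42",
       "user": "user-123", "step": "write_db"}
first = handle(msg)
second = handle(msg)   # redelivery caught by the side-cache
processed_ids.clear()  # simulate the side-cache being lost
third = handle(msg)    # processed again, but the keyed write leaves one row
```

Even when the side-cache is lost, the database still ends up with a single row for the order, because the write is keyed by the order's unique ID. That is why the cache is an optimization and not a requirement.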

What Cloud Pub/Sub with ordering keys gets you in this scenario is 1) durability of published messages 2) scalability across keys 3) ordering between messages in a key 4) retries in case one step fails 5) buffering in case your subscribers are slow or down 6) a fully hosted service (no dealing with your own cluster, scales automatically) 7) global availability (no need to shard your subscription by region, simplifying your app).

Disclaimer: I work on Cloud Pub/Sub, but this explanation is my own.

1 comments

Thanks for this explanation. Can you give a similarly concrete example of the behavior described in the docs: "When you receive messages in order and the Pub/Sub service redelivers a message with an ordering key, Pub/Sub maintains order by also redelivering the subsequent messages with the same ordering key. The Pub/Sub service redelivers these messages in the order that it originally received them."? I'm a little confused about what scenario with ordering would lead to the need to re-send multiple messages.