by ryanjshaw 4352 days ago
Thanks for those links. There seem to be some good concepts in there for formulating a solution to my present personal challenge: integrating multiple disparate sources of real-time events, some transient and delivered with low latency, others persistent but retrieved with high latency (up to 30 min!), all of which need to be analyzed (and potentially replayed and reanalyzed) to produce a best-effort real-time feature stream while populating (and repopulating) a reliable data warehouse. It's taking me a long time to break the problem down into components at the right level.

> As soon as a queue tries to have durable messages it is becoming a database and has all of the problems a database has to deal with. Instead you could keep data storage being solved by the data storage provider and let queues focus on passing messages.

Yes. Mind you, that doesn't stop the queue from having a persistent backing store (to reduce the cases where your application has to be involved in replay); it just means applications shouldn't treat queues as the golden source of events.
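
To make that split concrete, here is a rough sketch of how I picture it, not a real design. Everything in it is hypothetical: `EventStore` is an in-memory stand-in for whatever database holds the golden copy, `Queue` is a lossy bounded buffer standing in for the message bus, and the consumer is assumed to detect gaps from a per-event sequence number and fill them from the store.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int      # position in the store, assigned on append
    payload: str


class EventStore:
    """Hypothetical golden source; a list stands in for a real database."""

    def __init__(self):
        self._log = []

    def append(self, payload):
        event = Event(seq=len(self._log), payload=payload)
        self._log.append(event)
        return event

    def read_from(self, seq):
        # Return every stored event at or after the given sequence number.
        return self._log[seq:]


class Queue:
    """Hypothetical transient queue: bounded, silently drops the oldest message."""

    def __init__(self, maxlen):
        self._buf = deque(maxlen=maxlen)

    def publish(self, event):
        self._buf.append(event)

    def drain(self):
        items = list(self._buf)
        self._buf.clear()
        return items


class Consumer:
    def __init__(self, store, queue):
        self.store = store
        self.queue = queue
        self.next_seq = 0   # first sequence number not yet processed
        self.seen = []

    def poll(self):
        for event in self.queue.drain():
            if event.seq < self.next_seq:
                continue  # already processed, skip the duplicate
            if event.seq > self.next_seq:
                # Gap detected: the queue lost messages, so replay from the store.
                for missed in self.store.read_from(self.next_seq):
                    if missed.seq >= event.seq:
                        break
                    self._handle(missed)
            self._handle(event)

    def _handle(self, event):
        self.seen.append(event.payload)
        self.next_seq = event.seq + 1


def run_demo():
    store, queue = EventStore(), Queue(maxlen=2)
    consumer = Consumer(store, queue)
    for payload in ["a", "b", "c", "d", "e"]:
        queue.publish(store.append(payload))  # write to the store first, then notify
    consumer.poll()
    return consumer.seen
```

In `run_demo` the queue only retains the last two messages, yet the consumer still ends up with all five events in order because the missing ones come from the store. A queue with its own persistent backing would just make that gap-filling path rarer, not unnecessary.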