by ComNik
4059 days ago
Yes, I fully recognize the problem with double-writing.
I will definitely try out Bottled Water. I was also thinking about replacing Kafka with a much simpler, lower-throughput system (because we are light-years from LinkedIn's requirements). Two reasons why I can't just use Postgres (I'd love to):
1.) Kafka (or whatever queue we settle on) will be used for logs and metrics as well, data that doesn't flow through Postgres.
2.) Postgres stores the data model of my business domain at the lowest, normalized level. But derived data stores are inherently denormalized, and I want to be able to use them without talking back to my source of truth all the time. So currently I'm passing DTOs to Kafka, just like I would to any API request. This data is not easily available at the Postgres level.
I'm not yet sure about the right abstraction level for events. It seems very natural to have them contain information that I would send to clients directly.
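To make the "DTO as event" idea concrete, here is a minimal Python sketch of a denormalized event that carries everything a client or derived store would need, so consumers never have to query the source of truth. The `OrderPlaced` type, its fields, and the `to_message` helper are all hypothetical names chosen for illustration; nothing here is taken from an actual schema, and the actual publish call to Kafka is deliberately left out.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical denormalized event: names and titles are inlined,
    not referenced by foreign key as they would be in normalized tables."""
    order_id: int
    customer_name: str
    item_titles: List[str]
    total_cents: int


def to_message(event) -> bytes:
    """Serialize an event to the bytes a queue producer would send.

    The envelope shape ({"type": ..., "data": ...}) is an assumption.
    """
    envelope = {"type": type(event).__name__, "data": asdict(event)}
    return json.dumps(envelope, sort_keys=True).encode("utf-8")


msg = to_message(
    OrderPlaced(order_id=1, customer_name="Ada",
                item_titles=["book"], total_cents=1999)
)
```

The trade-off is the one described above: the event is self-sufficient for consumers, but it cannot be reconstructed from a single Postgres row, which is why capturing changes at the database level alone would not produce it.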
|
We have an application that might be similar. It receives analytics events from frontends. It currently uses RabbitMQ to distribute them to multiple "sinks", including InfluxDB, ElasticSearch and websockets; the main sink is one that stores the events as flat files (one JSON object per line) in S3. That's what we consider our master data.
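As a rough illustration of that fan-out shape, here is a small Python sketch in which every event is handed to a list of sinks, one of which appends it as a single JSON line. It uses an in-memory stream in place of S3 and a plain list in place of the other sinks; `make_jsonl_sink` and `fan_out` are hypothetical helper names, and the broker (RabbitMQ in the setup above) is not modeled at all.

```python
import io
import json
from typing import Callable, Dict, List, TextIO

Event = Dict[str, object]
Sink = Callable[[Event], None]


def make_jsonl_sink(stream: TextIO) -> Sink:
    """Return a sink that writes each event as one JSON object per line."""
    def sink(event: Event) -> None:
        stream.write(json.dumps(event, sort_keys=True) + "\n")
    return sink


def fan_out(event: Event, sinks: List[Sink]) -> None:
    """Deliver the same event to every sink, in order."""
    for sink in sinks:
        sink(event)


master = io.StringIO()   # stand-in for the flat file that would go to S3
seen: List[Event] = []   # stand-in for any other sink (metrics, search, ...)
sinks: List[Sink] = [make_jsonl_sink(master), seen.append]

fan_out({"type": "page_view", "path": "/home"}, sinks)
fan_out({"type": "click", "target": "signup"}, sinks)
```

Because the master copy is just newline-delimited JSON, any derived store can be rebuilt by replaying the file line by line through a new sink.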