Hacker News new | ask | show | jobs
by eventreduce 2250 days ago
The biggest usecase for EventReduce is realtime applications. Most technologies for these like Firebase, AWS AppSync etc. work on non-relational data. If you want to use EventReduce with relational queries, you have to make them non-relational before, for example by using materialized views. If you do not want to do that, you should not use this algorithm in its current featureset.
2 comments

My guess would be that if you're at a scale where you're thinking about these sorts of things, you are also at a scale where you're running on multiple machines. How does EventReduce share writes across the cluster?
EventReduce is an algorithm and not a database-wrapper. It will not care about your writes or if your database layer is a cluster and so also not affect them.
Sorry I wasn't clear in my original post.

I'm thinking about the application layer. If you have an application that writes data to a table, it's typical to run multiple instances of that application to support scale and reliability requirements.

If I send a write to one instance, how does it communicate and synchronise that write with the other application instances?

I ask because this can be a tricky thing to do, especially when consensus is required, as consensus algorithms such as Raft/Paxos require a number of network roundtrips which will introduce latency, and actually account for much of that latency in the database examples given in some cases.

EventReduce is a simple algorithm. It does not care or affect how you handle propagation of writes or how you handle your events, transactions or conflicts.

See it as a simple function that can do oldResults+event=newResults like shown in the big image on top of the readme.

This means then that if you run multiple application servers, which most do, that you’ll need to implement a data distribution mechanism of some sort.

I must admit, with limitations like this I’m struggling to figure out the use cases for this.

Edit: so I guess this is easier using the change subscriptions you mention in other comments. That does mean many subscribers, but hopefully that’s minimal load. This has the trade-off that it’s now eventually consistent, but I suppose that’s not a problem for many high read applications.

I’m still feeling like this could be solved in a simpler way with just simple data structures and a pub sub mechanism. Now I think of it, we do a similar thing with Redis for one service, and a custom Python server/pipeline in another, but we’ve never felt the need for this sort of thing.

Do you have more details about specific applications/use cases, and why this is better than alternatives?

I think the best example for why this is useful is described by david glasser at his talk about the oplog driver used in meteor.js https://www.youtube.com/watch?v=_dzX_LEbZyI
thank you for the clarification