| Looking at the diceDB code base, I have few questions regarding its design, I'm asking this to understand the project's goals and design rationale. Anyone feel free to help me understand this. I could be wrong but the primary in-memory storage appears to be a standard Go map with locking. Is this a temporary choice for iterative development, and is there a longer-term plan to adopt a more optimized or custom data structure ? I find the DiceDB's reactivity mechanism very intriguing, particularly the "re-execution" of the entire watch command (i.e re-running GET.WATCH mykey on key modification), it's an intriguing design choice. From what I understand is the Eval func executes client side commands this seem to be laying foundation for more complex watch command that can be evaluated before sending notifications to clients. But I have the following question. What is the primary motivation behind re-executing the entire command, as opposed to simply notifying clients of a key change (as in Redis Pub/Sub or streams)? Is the intent to simplify client-side logic by handling complex key dependencies on the server? Given that re-execution seems computationally expensive, especially with multiple watchers or more complex (hypothetical) watch commands, how are potential performance bottlenecks addressed? How does this "re-execution" approach compare in terms of scalability and consistency to more established methods like server-side logic (e.g., Lua scripts in Redis) or change data capture (CDC) ? Are there plans to support more complex watch commands beyond GET.WATCH (e.g. JSON.GET.WATCH), and how would re-execution scale in those cases? I'm curious about the trade-offs considered in choosing this design and how it aligns with the project's overall goals. Any insights into these design decisions would help me understand its use-cases. Thanks |
I'm skeptical that the re-execution approach can scale for complex queries, the latency and throughput improvements would be offseted by the computational cost and bottlenecks introduced for achieving it via its reactivity mechanism (query subscription), this might not work at scale and serve niche use cases.
There are various ways throughput and latency for kv stores can be improved, so bar is really high here.
The messaging with Dice seems unclear and confusing to describe its purpose/use-cases over alternatives, or how it achieves them, which could just be how it's marketed. But it seems to be a collection of ideas and a WIP project.
I think reducing data fetching complexity and complex key dependencies for end clients could be appealing, and it would be great to have it at the KV store level, but there is no reason this type of reactivity can't be implemented on top of various clients for existing KV stores (like Redis). And basic WATCH with transactions are even offered out of the box in them.
Deno kv seem nice but its vendor locked, also there are many others like dragonfly, valkey etc, redis could still work, even something over sqlite can work, deno has a selfhosted kv on top of sqlite - https://github.com/denoland/denokv
Also with dice its creator had made this talk
https://hasgeek.com/rootconf/2024/sub/how-we-made-dicedb-a-t...
From that and the thread so far it seems, they want to make some super cache by building a realtime multi-threaded kv store, improving latency and reducing its read load via its reactivity mechanism. Solving the problem of cache invalidation.
Not sure how this will be achieved but there is no harm in trying. From what is said and shared, rationale behind this design and its tradeoffs are not clear, code could be fixed/improved but providing clarity on this is essential for adoption.