by SanFranManDan
3021 days ago
I agree. However, I stumbled into the world of KV stores (like RocksDB, LMDB, LevelDB, etc.) last year, and what is most surprising is that they all stop in the same place. I understand that they should do one thing and do it well, but it is still disappointing when you have to implement things like replication, sharding, and indexing yourself. There really aren't even that many DBMSs that are KV (like Redis) out there to handle it either. They are normally much more complicated (like adding a SQL layer on top of it).
Indexing requires knowledge of a higher-level data model. (Again, BerkeleyDB has built-in support for secondary indexing, but last time I checked it was quite a braindead and slow implementation. It is faster to build your own indices using the other facilities provided.)
With that said, while a KV store has no logical data model to apply to index generation, it can at least provide primitives for you to construct your own indices. BerkeleyDB and LMDB do this.
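To make "construct your own indices" concrete, here is a minimal sketch of a secondary index built from nothing but ordered keys and a prefix scan. The `SortedKV` class is a toy in-memory stand-in, not the LMDB or BerkeleyDB API, and the `user/` and `idx/city/` key layout is a hypothetical convention chosen for illustration. In a real engine the writes in `put_user` would go inside a single write transaction so the record and its index entry can't diverge.

```python
import bisect


class SortedKV:
    """Toy ordered key-value store (hypothetical stand-in for a real engine)."""

    def __init__(self):
        self._keys = []   # byte keys, kept sorted
        self._data = {}   # key -> value

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        if key in self._data:
            del self._data[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def scan_prefix(self, prefix):
        # Seek to the first key >= prefix, then walk forward while it matches.
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i]
            i += 1


def put_user(kv, user_id, city):
    """Store a user record and maintain the city index by hand."""
    old_city = kv.get(b"user/" + user_id)
    if old_city is not None:
        # Drop the stale index entry before writing the new one.
        kv.delete(b"idx/city/" + old_city + b"/" + user_id)
    kv.put(b"user/" + user_id, city)
    # The index entry carries all its information in the key; value is empty.
    kv.put(b"idx/city/" + city + b"/" + user_id, b"")


def users_in_city(kv, city):
    """Answer a secondary-key query with a single prefix scan."""
    prefix = b"idx/city/" + city + b"/"
    return [key[len(prefix):] for key in kv.scan_prefix(prefix)]
```

The store itself never learns what a "city" is; the application encodes that knowledge into the key layout, which is all the engine needs to provide.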
Distribution with transaction support may require help from the storage engine (offering something resembling multi-phase commit). BerkeleyDB provides this already; LMDB will probably provide this in 1.0.
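As a rough illustration of what "something resembling multi-phase commit" asks of the engine, here is a minimal two-phase commit sketch. The `Participant` class and its `prepare`/`commit`/`abort` methods are hypothetical names, not BerkeleyDB's or LMDB's API, and everything is in memory. A real engine would have to make the prepared state durable so it survives a crash between the two phases, which is exactly the help the storage layer needs to provide.

```python
class Participant:
    """Hypothetical storage node that can stage writes before committing."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a node that votes "no"
        self.data = {}
        self.staged = None

    def prepare(self, writes):
        # Phase 1: stage the writes and vote. A real engine would persist
        # the staged state here so it can still commit after a restart.
        if not self.can_commit:
            return False
        self.staged = dict(writes)
        return True

    def commit(self):
        # Phase 2: make the staged writes visible.
        self.data.update(self.staged)
        self.staged = None

    def abort(self):
        self.staged = None


def two_phase_commit(participants, writes):
    """Commit on every participant or on none."""
    prepared = []
    for p in participants:
        if p.prepare(writes):
            prepared.append(p)
        else:
            # One "no" vote aborts everyone who already prepared.
            for q in prepared:
                q.abort()
            return False
    for p in participants:
        p.commit()
    return True
```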
An argument could be made that the storage engine should be able to handle replication/distribution even without understanding the higher level/application data model. BerkeleyDB does this with page-level replication. IME this results in gratuitously verbose replication traffic, as every high level operation plus all of its dependent index updates etc. are replicated as low level disk/page offset operations. IMO it makes more sense to leave this to a higher layer because you can just replicate logical operations, and save a huge amount of network overhead.
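A small sketch of the logical-replication idea: the primary ships one compact operation record, and each replica re-derives the dependent index updates locally instead of receiving them over the wire. The `Node` class, the JSON record format, and the `by_city` index are all assumptions made for illustration; this is not how BerkeleyDB or any particular higher-level DBMS encodes its replication stream.

```python
import json


class Node:
    """Hypothetical database node with one table and one secondary index."""

    def __init__(self):
        self.rows = {}      # user_id -> city
        self.by_city = {}   # city -> set of user_ids (secondary index)

    def apply(self, op):
        # Applying a logical op updates the row AND the index locally,
        # so index changes never need to be sent between nodes.
        if op["type"] == "put":
            uid, city = op["id"], op["city"]
            old_city = self.rows.get(uid)
            if old_city is not None:
                self.by_city[old_city].discard(uid)
            self.rows[uid] = city
            self.by_city.setdefault(city, set()).add(uid)


def replicate(primary, replicas, op):
    """Apply op on the primary, ship it to replicas, return bytes sent per replica."""
    wire = json.dumps(op).encode()
    primary.apply(op)
    for replica in replicas:
        replica.apply(json.loads(wire))
    return len(wire)
```

The record sent per replica is a few dozen bytes regardless of how many indices the operation touches, whereas a page-level scheme has to describe every modified page.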
As for the possible higher layers - antoncohen's response below gives a few examples. There are plenty of higher level DBMSs implemented on top of LMDB, providing replication, sharding, etc.