by rescrv 4768 days ago
I've not fully compared the code, but at first glance they're choosing different constants for compaction (such as level size and the ratio of compacted to uncompacted data) and have done away with the non-overlapping invariant for the first few levels of the tree. I've not benchmarked their code, so I don't know how effective this strategy is.

What I can see from their code is that they rely on a compaction strategy very similar to stock LevelDB's (in terms of selecting the next uncompacted SSTs within a level), that they haven't done anything to improve multi-threaded performance, and that they've invested a lot of work in computing if and when writes should be delayed within the LevelDB code. We've drastically changed the compaction strategy and begun to improve concurrency (I know for a fact that we have opportunities to improve it further). We also believe the storage layer is not the right place to decide when to throttle writes, so we've removed the code that does so.

I guess the fairest thing to say is that we've taken complementary approaches, and everything in either approach could be ported to the other.

1 comments

As the person doing the github:basho/leveldb work, I agree with the above (very professional reply, not something you expect to see on the Internet, thank you). We are optimizing to our individual environments. I do need to review the compaction algorithms to see if there is benefit to Basho. However, it will be a few weeks before that happens.

The write multi-threading does not help Basho's Riak 1.x series. We parallelize by using multiple leveldb databases (vnodes, in Riak terms) and do NOT parallelize within an individual database until 2.0. Therefore, it is unfair to measure that feature against Riak.

Our compaction code is adjusted with an emphasis on running multiple compactions of varied priority. Our use of multiple databases got hung up on leveldb's single compaction thread. Again, this complete difference in environments would make a direct comparison with Hyperleveldb unfair.
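To illustrate the idea of several databases sharing prioritized compaction work, here is a minimal sketch. It is not Basho's code: the `CompactionScheduler` class, the vnode names, and the priority numbers are all hypothetical, chosen only to show a shared priority queue replacing a single first-come compaction thread.

```python
import heapq
import itertools


class CompactionScheduler:
    """Toy scheduler (hypothetical): many databases share one prioritized job queue."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties so equal-priority jobs run in submit order.
        self._counter = itertools.count()

    def submit(self, priority, db_name, level):
        # Assumption for this sketch: a lower number means a more urgent job.
        heapq.heappush(self._heap, (priority, next(self._counter), db_name, level))

    def drain(self):
        # Pop jobs in priority order; a real system would hand these to worker threads.
        order = []
        while self._heap:
            _, _, db_name, level = heapq.heappop(self._heap)
            order.append((db_name, level))
        return order


sched = CompactionScheduler()
sched.submit(2, "vnode-a", 3)
sched.submit(0, "vnode-b", 0)
sched.submit(1, "vnode-a", 1)
sched.submit(0, "vnode-c", 0)
order = sched.drain()
print(order)
```

With a queue like this, an urgent job from one database is not stuck behind a low-priority job from another, which is the problem a single compaction thread per database cannot address.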

And the most difficult issue for dropping hyper into Riak is that our leveldb performs the write throttling, while hyperleveldb leaves that to HyperDex. This is yet again an environment design decision ... but it means coding is required to make hyperleveldb "just work" with Riak. That will be a while.
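The two placements of write throttling can be contrasted with a small sketch. Everything here is an assumption made for illustration: the function names, the backlog metric, and the thresholds are hypothetical and do not come from either code base.

```python
def storage_layer_delay(backlog_bytes, soft_limit, hard_limit, max_delay_s=0.1):
    """Hypothetical in-store policy: seconds to stall a write given compaction backlog."""
    if backlog_bytes <= soft_limit:
        return 0.0
    if backlog_bytes >= hard_limit:
        return max_delay_s
    # Ramp the delay linearly between the soft and hard limits.
    frac = (backlog_bytes - soft_limit) / (hard_limit - soft_limit)
    return max_delay_s * frac


def caller_should_back_off(backlog_bytes, limit):
    """Hypothetical upper-layer policy: the store never stalls, the caller decides."""
    return backlog_bytes > limit


# Storage-layer throttling: the store itself slows each write as backlog grows.
delays = [storage_layer_delay(b, 100, 200) for b in (50, 150, 250)]

# Caller-side throttling: the store only reports backlog; the layer above reacts.
decisions = [caller_should_back_off(b, 100) for b in (50, 150, 250)]
print(delays, decisions)
```

A store built around the first function needs no cooperation from its caller; one built around the second needs the caller to supply the policy, which is why swapping one for the other takes extra coding.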

I therefore do not claim that Basho's leveldb would be better with HyperDex and suspect that today's hyperleveldb would not be better with Basho's Riak. We optimized to our different pain points.

Matthew