by thu2111 2006 days ago
Some of the benefits of this come from separating values from keys. This is an increasingly widely used technique: it is described in the WiscKey paper [1] and is also used in the PingCAP fork of RocksDB. It seems Chinese companies like forking RocksDB. I am not sure why; perhaps the combination of firewall and language barrier just makes it easier to fork and move fast than to work regularly with upstream.

By separating out large values, there's less write amplification, and things get faster because more of the SSTs fit in the RAM cache. RocksDB wasn't historically a great choice for holding things like file uploads; you'd use the traditional filesystem for that. But that's quite constraining. When large values work better, it is not only a performance increase but also enables new software designs.
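
To make the idea concrete, here is a rough sketch of key-value separation in the spirit of WiscKey. It is not how RocksDB or any fork actually implements it: the class name, the 64-byte threshold, and the 16-byte pointer size are all made up for illustration, a plain dict stands in for the LSM tree, and an in-memory buffer stands in for the value log file.

```python
import io

POINTER_SIZE = 16  # assumed on-disk size of an (offset, length) pointer


class KVSeparatedStore:
    """Toy store: large values go to an append-only log, the index keeps pointers."""

    def __init__(self, threshold=64):
        self.threshold = threshold  # hypothetical cutoff for "large" values
        self.index = {}             # stands in for the LSM tree (the SSTs)
        self.vlog = io.BytesIO()    # stands in for the value log file

    def put(self, key, value):
        if len(value) < self.threshold:
            # Small values stay inline in the index.
            self.index[key] = ("inline", value)
        else:
            # Large values are appended to the log; the index stores a pointer.
            offset = self.vlog.seek(0, io.SEEK_END)
            self.vlog.write(value)
            self.index[key] = ("pointer", (offset, len(value)))

    def get(self, key):
        kind, payload = self.index[key]
        if kind == "inline":
            return payload
        offset, length = payload
        self.vlog.seek(offset)
        return self.vlog.read(length)

    def index_payload_bytes(self):
        """Bytes of value data the index itself has to carry around."""
        total = 0
        for kind, payload in self.index.values():
            total += len(payload) if kind == "inline" else POINTER_SIZE
        return total


store = KVSeparatedStore()
store.put("config", b"small")
store.put("upload", b"x" * 1_000_000)
print(store.index_payload_bytes())  # value bytes held by the index
print(len(store.get("upload")))     # the large value is read back from the log
```

In a real LSM tree, compaction would only rewrite what the index holds, so the megabyte value is written once to the log instead of being copied again at every level. The cost is an extra read to follow the pointer, plus garbage collection of the log, which this sketch leaves out.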

[1] https://www.usenix.org/system/files/conference/fast16/fast16...