by sudeepj 2003 days ago
Why not merge the improvements into RocksDB itself?

There are mainly three reasons here:

1. We changed the source code so much that we cannot easily merge it back into RocksDB (this project started in 2016 as a closed-source project).
2. We have a different roadmap from RocksDB (e.g. we will remove a lot of unused code to make TerarkDB much more lightweight than the current version).
3. We have lots of third-party partners (e.g. Intel, on Optane SSD/Memory, and others with ZNS...) who may participate in this project, so we want to handle all commits ourselves to make sure everything is under control.

It's open source now, right? Outside of 2 and 3, could someone incorporate (some) of the improvements from TerarkDB into RocksDB? Or does it truly require some major rewrite to achieve the tail-latency benefits?

The comparison figures presented looked really impressive, thanks for sharing it.

3) is not in line with an open source philosophy.

EDIT: Detrimental to the original. Eg. Amazon forking and selling MongoDB.

First, it’s reeaallllyyyy expensive to invest enough in an open source project that you have a reasonable chance of steering it.

Second, even if you do the first, the whole thing gets screwed up again when you start trying to introduce vendor code into the mix. Generally, no one upstream gives a crap that you have super compelling business reasons to compromise on code quality (or even trivial things like how code is committed: tarballs vs good git hygiene), and vendors sometimes compromise a lot.

So it’s not surprising that sometimes groups choose to do the expedient thing to get something to market instead of doing things “the right way.” In a lot of respects, the original Android did this with Linux.

Competition is good.

> In a lot of respects, the original Android did this with Linux.

Android vendors keep doing this over and over again with Linux, which explains why so many phones are stuck on old versions of Android.

Imagine if there were multiple incompatible and competing Linux kernels. What we have now is AMD/MS/Apple etc. contributing to the kernel through "vendor code". Imagine if AMD released an AMDLinux and Nvidia had NvidiaLinux.
This already happens, because most (?) people aren’t running vanilla kernels. Many (most?) distros compile their kernels with config options and patches that “make sense to them.” In the most egregious cases, you end up with things like bpf being intentionally broken by default.
It is perfectly in line with open source philosophy to be able to fork a project and have control over my fork. Especially given 2 where they have different goals from upstream.
Not necessarily true in this case. They are compatible, and it's merely a performance improvement in the code.
Amazon did not fork MongoDB; they won't touch AGPL code. They reimplemented the server-side protocol and a backend implementation on top of PostgreSQL, afaics.
How so? Unless they are stopping normal users from committing code as well?
Even if they stopped normal users from committing it would still be adhering to open source philosophy.
It feels like this is healthy, organic and very much in line with the ethos of open source to see a project take this path and arrive back in open source. If the rocks team wanted to cherry pick some compatible advancements from this project they are now free to do so.

There are much more egregious and fundamentally different violations of open source, namely those you mention in your comment.

Sure it is; it’s exactly equivalent to something like forking Linux with the reasoning “I want to be the BDFL now” — eg the nvim fork
Wasn't the driver for nvim specifically disagreements with the direction/priorities/steer of the project? Is progress in a different direction necessarily a bad thing, especially if that effort couldn't be directly applied to the original anyway?

Please someone feel free to correct me, but if I recall correctly a lot of the improvements in Vim 8 were a result of the popularity of functionality in NeoVim?

You're correct -- which is why I've used it as an example of forking for project-control reasons to be perfectly in line with an open-source philosophy.
No disagreements there. Contention is it's not good for the original.
I didn't know this. How do I contribute to Oracle's Unbreakable Linux or Red Hat's RHEL? I know I can fork them, but I'm not sure how I can push my commits into their code, and I didn't realize that was required!
I did not say it was required. But you can always contribute.
(3) is exactly how SQLite is developed
> Eg. Amazon forking and selling MongoDB.

Are they giving back the source? And letting Mongo merge their changes if they wish?

Because that's what open source is all about.

Leadership, or a steering committee, is a key factor for open source projects operated by companies. A pull request closed with the comment "We won't accept the pull request because ..." should not be part of the trajectory of an infrastructure project that is, or is going to be, widely used by giant vendors.

So RocksDB came from LevelDB and here we go again.