by ncann 780 days ago
Could it be that RDBMS is just inherently very complex? Reminds me of this classic comment about Oracle Database:

https://news.ycombinator.com/item?id=18442941

To quote part of it

> Oracle Database 12.2.

> It is close to 25 million lines of C code.

> What an unimaginable horror! You can't change a single line of code in the product without breaking 1000s of existing tests. Generations of programmers have worked on that code under difficult deadlines and filled the code with all kinds of crap.

> Very complex pieces of logic, memory management, context switching, etc. are all held together with thousands of flags. The whole code is ridden with mysterious macros that one cannot decipher without picking a notebook and expanding relevant parts of the macros by hand. It can take a day to two days to really understand what a macro does.

> Sometimes one needs to understand the values and the effects of 20 different flags to predict how the code would behave in different situations. Sometimes 100s too! I am not exaggerating.

> The only reason why this product is still surviving and still works is due to literally millions of tests!

4 comments

> The only reason why this product is still surviving and still works is due to literally millions of tests!

Personally I consider this a good thing. It's a sign of a really mature codebase where lots of edge cases are known + accounted for.

Even if the underlying code were really well written, the sheer number of edge cases would hamstring any "quick hacks".

Complex, runs reliably, easy to hack - Pick two

Tests are great, but relying on them in this way is like relying on a net to catch you without wearing a harness. It's a good thing if your last line of defense is reliable enough to catch you. But if you're relying on it, it's not a last line of defense, it's the only one.

You should be able to work on software because you understand how it works and what the ramifications of a given change are. Tests and code reviews provide redundancy. But here, they aren't providing redundancy, they're bearing the load.

What provides redundancy if tests are missing, broken, or misinterpreted? Have you ever fixed a bug, gone to write a test for it - and found the test already exists but passed spuriously?

In that sort of codebase, the only thing scarier than changing a line of code and breaking thousands of tests is changing a line of code and not breaking any tests.
bloodcurdling
I don't think it's RDBMS as such. Rather, the combinatorial explosion of configuration/flag/option/platform combinations is insidious, and we, software engineering as a field, don't know how to handle it well.

I think this is one of the highest-impact problems in software engineering, if it can be improved. Maybe what we need is a way to restrict flag interactions, which would shrink the support and test matrix as a result.
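
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes every flag is a plain boolean, and the "grouped" case assumes flags in different groups provably never interact, which is of course the hard part. The function names and the 4x5 grouping are made up for illustration.

```python
from itertools import product


def full_matrix_size(num_flags: int) -> int:
    """Number of configurations if every boolean flag can interact with every other."""
    return 2 ** num_flags


def grouped_matrix_size(group_sizes) -> int:
    """Number of configurations if flags only interact within their own group.

    Assumption: groups are truly independent, so each group can be tested
    exhaustively on its own and the counts add instead of multiply.
    """
    return sum(2 ** size for size in group_sizes)


def enumerate_configs(flag_names):
    """Yield every on/off assignment for the given flags (only feasible for small sets)."""
    for values in product((False, True), repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


print(full_matrix_size(20))               # 1048576
print(grouped_matrix_size([5, 5, 5, 5]))  # 128
```

So the 20 flags from the quote already mean about a million configurations if anything can interact with anything, versus 128 if the same flags could be fenced into four independent groups of five.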

Something I enjoyed was listening to the talks by the LibreSSL team cleaning up the mess in OpenSSL that (in part) caused the Heartbleed bug.

One of their strategies was to drop the macro soup and simply program against the "libc we would like to have", and then add compatibility shims to materialise their ideal libc instead of conditional compilation at the point of use.

I suspect that an even bigger cause of the brittleness described in TFA is that an RDBMS inherently has to deal with concurrency. And not in the way that most applications do: the RDBMS is where other applications push their hairy concurrency problems.