by noahl 2016 days ago
I'd be curious to know what the libraries are that allow this sort of database engine design. Are there any open source examples?

I’ve never seen one in open source, but I’ve written a couple over the years and seen a few others. It forces you to solve many difficult metaprogramming challenges elegantly, so it is a good excuse to achieve some mastery. I’ve burned more hours than I care to admit trying to figure out how to achieve seemingly simple results. Also, until C++17, this was an exercise in masochism because of the language’s limited expressiveness and type inference, which is why so few people tried. A lot of unpleasant rough edges went away with C++17 because the language finally became smart enough to handle them. C++20 will also be a big step forward for this, whenever the compilers become usable.

One thing to understand is that these libraries are highly opinionated about the abstract architectural model. It doesn’t make a lot of sense to mix components from user space and kernel space designs, for example, though both have advantages separately. It tends to be more along the lines of one abstract architectural model and enormous amounts of elasticity and flexibility within that model based on the data models, workloads, transaction semantics, and hardware you are targeting. You also still have to write a spec that makes sense from a database engineering standpoint.
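As a rough illustration of what "writing a spec" could look like in C++17, here is a minimal sketch. Every name in it (Workload, EngineSpec, Engine, BTreeIndex, LogStructuredIndex) is hypothetical and invented for this example; it is not taken from any real library, and a real engine would expose far more knobs. The idea is only that a spec type is checked and turned into concrete component choices at compile time.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical workload knob; not from any real library.
enum class Workload { ReadMostly, WriteHeavy };

// Placeholder component types standing in for real index implementations.
struct BTreeIndex         { static constexpr const char* name = "btree"; };
struct LogStructuredIndex { static constexpr const char* name = "lsm"; };

// The "spec": plain compile-time data, validated when it is instantiated.
template <Workload W, std::size_t PageBytes>
struct EngineSpec {
    static_assert(PageBytes >= 512 && (PageBytes & (PageBytes - 1)) == 0,
                  "page size must be a power of two, at least 512 bytes");
    static constexpr Workload workload = W;
    static constexpr std::size_t page_bytes = PageBytes;
};

// The scaffold: derives component types and constants from the spec.
template <typename Spec>
struct Engine {
    // Choose the index implementation from the declared workload.
    using Index = std::conditional_t<Spec::workload == Workload::WriteHeavy,
                                     LogStructuredIndex, BTreeIndex>;
    // Number of 8-byte slots that fit in one page.
    static constexpr std::size_t slots_per_page =
        Spec::page_bytes / sizeof(std::uint64_t);
};

// Two example instantiations of the same scaffold.
using OltpEngine      = Engine<EngineSpec<Workload::WriteHeavy, 4096>>;
using AnalyticsEngine = Engine<EngineSpec<Workload::ReadMostly, 8192>>;

static_assert(std::is_same_v<OltpEngine::Index, LogStructuredIndex>);
static_assert(std::is_same_v<AnalyticsEngine::Index, BTreeIndex>);
```

A spec that makes no sense, such as a 1000-byte page, fails to compile instead of failing at runtime, which is one plausible reading of the flexibility-within-one-model point above.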

At least for me, there are still significant parts of a database engine for which I haven’t built a metaprogramming scaffold. That is largely a matter of time and effort. For other parts, I haven’t had to write much code in years, yet I still get a state-of-the-art implementation built to spec.

Ultimately, I’m trying to automate away my job.

The problem with such fancy metaprogramming in C++ is that, while it may be a thrill to program for a C++ wiz, you may end up making something that is completely unmaintainable because nobody else can grasp what you wrote.

Compare that to writing a database system in something like Go. Sure, you may end up with 50% more code, but you could have anybody up and running, reading and understanding the code, within 3 days.

IBM have done studies of this and found that fancy code is not all that valuable. It ends up falling into disuse over time because people don't get it. I have seen my fair share of C++ code which simply had to be tossed because nobody at the company could understand what the previous whiz kid had written.

That’s the beauty of the evolution of C++. Metaprogramming has become increasingly maintainable, as making it a first-class capability of the language is a core focus of the people designing it. The learning curve is finally low enough that it is practical. I think it is fair to say that C++17 is the first version where that is true.

You can’t write a comparable database engine in Go, fundamentally. The language lacks features required for competitive performance. The code difference will be much more than 50% trying to get the most out of what Go is capable of in this domain.

The point of writing code this way isn’t to be clever or for a “thrill”; it objectively produces superior performance, reliability, and maintainability. Defects scale with the number of lines of code regardless of language. Type-safe code gen is a powerful tool.

>The learning curve is finally low enough that it is practical. I think it is fair to say that C++17 is the first version where that is true.

What are the specific C++17 features that make this a reality?

Just look at the features in C++17 that simplify the use of templates: fold expressions over variadic templates (variadic templates themselves date from C++11), if constexpr and constexpr lambdas, class template argument deduction, auto for non-type template parameters, etc. The last few versions of C++ are mostly about making templates easy to use, which allows programmers to operate at a different level than in languages such as Go and Python.
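To make that concrete, here is a small sketch of a few C++17 template conveniences: a fold expression, if constexpr, class template argument deduction, and structured bindings. The names row_size, encode, Entry, and describe are made up for illustration and have nothing to do with any particular database engine.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Fold expression: add up the sizes of any number of column types
// without writing a recursive template.
template <typename... Columns>
constexpr std::size_t row_size() {
    return (sizeof(Columns) + ... + 0);
}

// if constexpr: the branch is chosen at compile time, and the discarded
// branch is not instantiated for the given T.
template <typename T>
std::string encode(const T& value) {
    if constexpr (std::is_arithmetic_v<T>) {
        return std::to_string(value);
    } else {
        return std::string(value);
    }
}

// A simple pair-like type used to show class template argument deduction.
template <typename K, typename V>
struct Entry {
    K key;
    V value;
    Entry(K k, V v) : key(std::move(k)), value(std::move(v)) {}
};

inline std::string describe() {
    Entry e{42, std::string("answer")};  // deduced as Entry<int, std::string>
    auto [key, value] = e;               // structured binding over public members
    return encode(key) + "=" + encode(value);
}

static_assert(row_size<int, double>() == sizeof(int) + sizeof(double));
static_assert(row_size<>() == 0);
```

Before C++17, encode would typically need SFINAE or tag dispatch with two overloads, row_size would need a recursive helper, and Entry would need a make_entry factory function to get the types deduced.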