by jstimpfle 2117 days ago
It wasn't me who said "templates" first. (I don't think of them as a practical way to do compile time computation).

I was mostly looking at constexpr and whatever similar things have appeared in C++ lately. And I wanted to know about actual applications of them in the wild that make a difference.

Because, yeah I can precompute a 100K hash table or whatever at compile time, but I can also just do it at program startup (would anyone notice?) which is by far the simplest thing to do. Or I could just generate the data in a separate build step which is probably more hassle compared to constexpr but also probably friendlier in terms of build times in practice.

3 comments

In embedded systems, often not all memory is equal. Pre-computing a lookup table at runtime may not be practical due to the limited amount of RAM vs. flash memory. A constexpr or template metaprogram is, as you touched on, a nice way to do calculations at compile time in the existing language without having to add an explicit autocoding build step. An explicit build step eventually makes sense for sufficiently complex algorithms, but it can be a lot of build system maintenance overhead for small to medium complexity stuff. Implementing esoteric code using obscure syntax may be bad for readability, but keeping it "in the language" has the benefit of limiting the amount of project-specific knowledge required to understand it.
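To make the flash vs. RAM point concrete, here is a minimal sketch of a lookup table built entirely by the compiler (C++14 or later). The names `Crc32Table`, `make_crc32_table`, `kCrc32Table`, `crc32` and `crc32_demo` are made up for illustration. Whether the table actually ends up in flash depends on the toolchain and linker script; a constexpr object with static storage typically lands in a read-only section such as .rodata, but that is an assumption about your platform, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Plain struct wrapper so the table can be returned from a constexpr function.
struct Crc32Table {
    std::uint32_t v[256];
};

// Build the standard reflected CRC-32 table (polynomial 0xEDB88320).
constexpr Crc32Table make_crc32_table() {
    Crc32Table t{};
    for (std::uint32_t i = 0; i < 256; ++i) {
        std::uint32_t c = i;
        for (int bit = 0; bit < 8; ++bit) {
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        }
        t.v[i] = c;
    }
    return t;
}

// `constexpr` here forces the compiler to evaluate the generator;
// no initialization code runs at program startup.
constexpr Crc32Table kCrc32Table = make_crc32_table();

// Table-driven CRC-32, itself usable in constant expressions.
constexpr std::uint32_t crc32(const char* data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc = kCrc32Table.v[(crc ^ static_cast<unsigned char>(data[i])) & 0xFFu] ^ (crc >> 8);
    }
    return crc ^ 0xFFFFFFFFu;
}

// Both checks are done by the compiler, which shows the table exists at build time.
static_assert(kCrc32Table.v[1] == 0x77073096u, "table entry computed at compile time");
static_assert(crc32("123456789", 9) == 0xCBF43926u, "standard CRC-32 check value");

// The same function also works on data only known at runtime.
inline std::uint32_t crc32_demo(const char* msg, std::size_t len) {
    assert(msg != nullptr);
    return crc32(msg, len);
}
```

The runtime-startup alternative would be the same loop writing into a non-const global, which costs 1 KB of RAM here instead of 1 KB of read-only storage.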
Now this I call a reasonable comment! I learned something, thank you.
Thanks! Thinking a bit more about it, I could imagine some performance impacts even on full-featured CPUs. With virtual memory and the OS paging stuff in and out of physical RAM under memory pressure, read-only data can be evicted faster than writable data. The former, being immutable, can just be forgotten and then reloaded from disk when it's accessed again, while the latter has to be written to a swap file first, and writes are typically orders of magnitude slower than reads. Doesn't matter as long as you have plenty of RAM though.
Another benefit on that line is that read-only memory can be shared between processes.

I'm not sure that this is hugely relevant these days for small stuff, though. Like < 1MB... how many instances of the same program do we have running simultaneously, anyway?

Matrix compile-time templates like Eigen result in vastly faster code than doing it in C, since many operations can be simplified at compile time. C has no way to do this at compile time.

This is just the tip of the iceberg on using templates and classes to make faster, cleaner code.
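For readers who haven't seen the technique, here is a minimal sketch of expression templates, the general idea behind this kind of compile-time simplification. It is not Eigen's implementation; `Vec`, `Sum`, `is_expr` and `fused_sum_demo` are hypothetical names. The point it shows: `a + b + c` builds a lightweight expression type instead of temporaries, and the assignment runs a single fused loop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Lazy node representing lhs + rhs; nothing is computed until operator[] is called.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

template <std::size_t N>
struct Vec {
    std::array<double, N> data;
    double operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i) { return data[i]; }

    // Assigning from any expression evaluates it element by element in one loop.
    template <typename E>
    Vec& operator=(const E& expr) {
        for (std::size_t i = 0; i < N; ++i) data[i] = expr[i];
        return *this;
    }
};

// Trait restricting operator+ to the types of this sketch.
template <typename T> struct is_expr : std::false_type {};
template <std::size_t N> struct is_expr<Vec<N>> : std::true_type {};
template <typename L, typename R> struct is_expr<Sum<L, R>> : std::true_type {};

template <typename L, typename R,
          typename = std::enable_if_t<is_expr<L>::value && is_expr<R>::value>>
Sum<L, R> operator+(const L& lhs, const R& rhs) {
    return {lhs, rhs};
}

// Usage sketch: three vectors summed in a single pass, no intermediate Vec.
// Caveat: storing `a + b + c` in an `auto` variable would leave dangling
// references to the inner temporary; evaluate within the same full expression.
inline Vec<3> fused_sum_demo() {
    Vec<3> a{{1.0, 2.0, 3.0}};
    Vec<3> b{{10.0, 20.0, 30.0}};
    Vec<3> c{{100.0, 200.0, 300.0}};
    Vec<3> out{};
    out = a + b + c;
    assert(out[0] == 111.0);
    return out;
}
```

A real library layers much more on top of this (aliasing checks, vectorization, lazy vs. eager decisions), so treat it as an illustration of the mechanism only.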

In C, you could provide a bunch of functions that chain together the permutations of operations that can be optimized. I.e. TransverseMultiplyMatrixInverseDotProduct() or whatever actually makes sense. Since you can't overload operators, folk would have to read through the available functions to find what they need anyway. It wouldn't be pretty, but it would be functional and probably compile down to similar machine code.
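As a rough sketch of what one such hand-fused function could look like, here is a C-compatible routine (written in C++ with C linkage) that computes transpose(A) * B without ever materializing the transpose. The name `mat_transpose_mul` and the row-major n x n layout are assumptions made up for this example, not any existing library's API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fused operation: out = transpose(a) * b, all n x n, row-major.
// Element (i, j) of the result is sum over k of a[k][i] * b[k][j], so the
// transpose is handled purely by how `a` is indexed.
// `out` must not alias `a` or `b`.
extern "C" void mat_transpose_mul(const double* a, const double* b,
                                  double* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[k * n + i] * b[k * n + j];
            }
            out[i * n + j] = sum;
        }
    }
}
```

The cost is visible right away: every fused combination needs its own function and its own name, which is the API-surface tradeoff being discussed.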
No, you cannot, not without putting an incredible amount of work on the programmer's plate.

Consider the simple problem of multiplying together a sequence of N matrices of possibly different sizes with the least amount of work. The order you multiply in is determined via some optimization technique. You can try to have a different C function for each N, but eventually you will have some N for which your lib doesn't have the call. Or maybe you'll try to pack pointers into an array and pass that, which is now slower and more memory costly. In any case the order must be solved at runtime.

Templates allow the order to be determined at compile time for matrices of known size. This cannot be done in general in C, which has no equivalent compile-time mechanism.

And, if the matrices were constexpr, this can be computed at compile time.

So the template method, giving you Turing complete operations, can do things that you cannot do in C.

This is just a simple example, the tip of the iceberg.
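A small sketch of the compile-time part of this, assuming the matrix dimensions are known constants. It uses a C++14 constexpr function rather than template recursion, and it only computes the minimal cost of the best multiplication order (the classic matrix-chain dynamic program); picking the actual evaluation order from the same table would be the next step. `min_chain_cost`, `kDims` and `chain_cost_demo` are hypothetical names, and this is not how any particular library does it.

```cpp
#include <cassert>
#include <cstddef>

// Minimum number of scalar multiplications needed to compute A1 * A2 * ... * An,
// where Ai has shape dims[i-1] x dims[i]. M is the number of dims, so n = M - 1.
template <std::size_t M>
constexpr long long min_chain_cost(const long long (&dims)[M]) {
    static_assert(M >= 2, "need at least one matrix");
    constexpr std::size_t n = M - 1;
    long long cost[M][M] = {};  // cost[i][j]: best cost for Ai..Aj (1-based)
    for (std::size_t len = 2; len <= n; ++len) {
        for (std::size_t i = 1; i + len - 1 <= n; ++i) {
            const std::size_t j = i + len - 1;
            long long best = -1;
            for (std::size_t k = i; k < j; ++k) {  // split as (Ai..Ak)(Ak+1..Aj)
                const long long c = cost[i][k] + cost[k + 1][j] +
                                    dims[i - 1] * dims[k] * dims[j];
                if (best < 0 || c < best) best = c;
            }
            cost[i][j] = best;
        }
    }
    return cost[1][n];
}

// A1 is 10x30, A2 is 30x5, A3 is 5x60.
constexpr long long kDims[] = {10, 30, 5, 60};

// (A1*A2)*A3 costs 1500 + 3000 = 4500; A1*(A2*A3) costs 9000 + 18000 = 27000.
// The compiler works out the cheaper one; nothing is solved at runtime.
static_assert(min_chain_cost(kDims) == 4500, "optimal order found at compile time");

// The same function also runs at runtime when dims are not constants.
inline long long chain_cost_demo() {
    const long long dims[] = {10, 30, 5, 60};
    const long long c = min_chain_cost(dims);
    assert(c == 4500);
    return c;
}
```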

I'm not disputing that you can make prettier, more scalable APIs in C++ than in C. My point is that it's not completely hopeless in C either, though. In practice, the user of a matrix math library needs to understand the operations they're doing, and especially so if they actually care about performance. In the example you gave of a string of matrix multiplications, matrix multiplication isn't commutative, so the order is the order that the programmer wrote them in. The compiler is still free to reorder and coalesce redundant calculations with sufficient inlining. Also, N is small for 99% of use cases where performance matters, and when N is large, falling back to a slower "runtime" implementation is perfectly reasonable because the runtime overhead is insignificant compared to the overall cost of the operation; Eigen itself does that internally. A blanket claim that pointers are "slower" and memory costly also seems a bit oversimplified. They are usually worse than passing by value for small data sizes, but for larger data sizes, some sort of reference passing somewhere will be faster than doing unnecessary memory copies. For sufficiently large data sizes, a straightforward handwritten "runtime" algorithm implementation may even happen to be faster than a compiler-generated specialized equivalent, depending on the hardware's memory model.

Eigen is a great library and very convenient to use. It's great to be able to write straightforward chains of matrix operations and trust that the resulting program will be reasonably fast. There's no need to be dogmatic about C vs. C++, though. They're both higher-level languages targeting the same underlying hardware. Templates enable library developers to make simple APIs at the expense of more complicated library implementations. In C, it's often necessary to compromise on the simplicity of the API to achieve the same performance, but it also generally means that the library implementations are simpler. The overall quality of the resulting binary can be about the same, and is almost certainly within the same ballpark performance-wise. As an embedded engineer, I often need APIs that are compatible with C whether or not the implementation is C++, and I value simple library implementations over complex ones; the libraries and my use cases are often obscure enough that they are buggy, and so the more readable the library is, the easier it is for me to debug it.

As a recent real-world example, a coworker, who is a wizard that knows way more than I do about signal processing, implemented some matrix-heavy algorithms in a high-level language that supports just-in-time compilation down to parallelized CPU and even GPU machine code. It worked great on an x86-64 workstation, but on production hardware, we struggled to get the code to run fast enough; it would peg all the CPUs at 100%. The many layers of libraries and JIT compilation made the system very hard to debug even after a couple weeks of trying. I suggested re-implementing the algorithm in C++ using whatever matrix library was most convenient, and a few days later the system was running perfectly and averaging 14% of one CPU. The algorithm went from maybe 50 lines of very readable code to 250 lines of relatively ugly code, but we understood what it was doing way better. I believe he used Eigen in the C++ implementation, but whether or not the matrix library was optimized at all, C, C++, or Rust, it still would have sipped around 14% of one CPU. My point is that, when performance matters, you need to understand what the software and hardware are doing, and so there's value in simplicity and pragmatism.

Of course you could also just write assembly if you wanted the exact most optimized machine code.

It’s also hard to generalize these things in the form of those kinds of macros. Whereas with something like Eigen, just write your code like normal, you don’t have to worry about the special cases, and the compiler rewrites it for you. That’s one of the nice benefits, one of many, of metaprogramming.

Indeed Eigen is a popular library for its expression templates that make many math operations much faster.
> It wasn't me who said "templates" first.

But you were talking about generics, which implied you were incorrectly conflating that with template meta-programming, since generics are done with templates.

> I don't think of them as a practical way to do compile time computation

Templates are, however, the most flexible, advanced technique for C++ compile-time programming, though the C++ standard is evolving to bring more and more of those features into the language without using templates, e.g. "constexpr if".
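To illustrate the difference being described, here is a hedged sketch that does the same compile-time computation twice, once as a classic template metaprogram and once as a plain constexpr function, followed by a C++17 `if constexpr` branch. All names (`Factorial`, `factorial`, `describe`) are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Classic template metaprogram: recursion expressed through specializations.
template <unsigned N>
struct Factorial {
    static constexpr std::uint64_t value = N * Factorial<N - 1>::value;
};
template <>
struct Factorial<0> {
    static constexpr std::uint64_t value = 1;
};

// The same computation as an ordinary-looking C++14 constexpr function.
constexpr std::uint64_t factorial(unsigned n) {
    std::uint64_t result = 1;
    for (unsigned i = 2; i <= n; ++i) result *= i;
    return result;
}

static_assert(Factorial<10>::value == 3628800, "template version");
static_assert(factorial(10) == 3628800, "constexpr version");

// C++17 `if constexpr`: branches not taken for a given T are discarded and
// never instantiated, so std::to_string is not required to accept every T.
template <typename T>
std::string describe(const T& value) {
    if constexpr (std::is_integral<T>::value) {
        return "int:" + std::to_string(value);
    } else if constexpr (std::is_floating_point<T>::value) {
        return "float";
    } else {
        return "other";
    }
}

// Usage sketch.
inline void describe_demo() {
    assert(describe(42) == "int:42");
    assert(describe(std::string("x")) == "other");
}
```

Before C++17, the `describe` dispatch would typically have needed overloads, tag dispatch, or `std::enable_if`.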

My recommendation would be to avoid speculating on what the benefits are of a language or its features if you clearly don't have serious experience using them. It's fine if C++ is not for you.

> But you were talking about generics, which implied you were incorrectly conflating that with template meta-programming, since generics are done with templates.

I'm aware this is a pointless discussion, but please double check your claims are right if we are in "check mode". I did not say "generics".

>> Any examples of using compile time features that make a difference, instead of making code harder to maintain and increasing compile times significantly?

> My recommendation would be to avoid speculating on what the benefits are of a language or its features if you clearly don't have serious experience using them. It's fine if C++ is not for you.

Maybe you shouldn't make such statements if you don't know about my experience. I am speaking from experience, and exchanging subjective experiences isn't worthwhile most of the time, but sometimes (if people don't go down to personal attacks) there is a new viewpoint to find.

> I did not say "generics".

FYI if you’re talking about compile-time programming and you use the word “generic” (as you did, in the context of a generic sort, which actually does have a meaning related to generics, as the routine works on containers of any type), note that this is a well-established term, which could be confusing if you actually are referring to something else.