by sgtnoodle 2111 days ago

I'm not disputing that you can make prettier, more scalable APIs in C++ than in C. My point is that it's not completely hopeless in C either. In practice, the user of a matrix math library needs to understand the operations they're doing, especially if they actually care about performance. In the example you gave of a string of matrix multiplications, matrix multiplication isn't commutative, so the order is the order the programmer wrote them in. The compiler is still free to reorder and coalesce redundant calculations given sufficient inlining. Also, N is small for 99% of use cases where performance matters, and when N is large, falling back to a slower "runtime" implementation is perfectly reasonable because the runtime overhead is insignificant compared to the overall cost of the operation; Eigen itself does that internally. A blanket claim that pointers are "slower" and costly in memory also seems oversimplified. They are usually worse than passing by value for small data sizes, but for larger data sizes, some sort of reference passing somewhere will be faster than doing unnecessary memory copies. For sufficiently large data sizes, a straightforward hand-written "runtime" algorithm implementation may even happen to be faster than a compiler-generated specialized equivalent, depending on the hardware's memory model.
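To make the small-N versus large-N split concrete, here is a minimal sketch, not how Eigen or any particular library does it. The names `Mat`, `mul`, and `mul_runtime` are made up for illustration. The fixed-size path has loop bounds known at compile time, so the compiler has a chance to unroll and inline it; the runtime path takes dimensions as arguments and passes operands by pointer to avoid copying large buffers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size, row-major matrix; dimensions are compile-time constants.
template <std::size_t R, std::size_t C>
struct Mat {
    std::array<double, R * C> v{};  // zero-initialized storage
    double& operator()(std::size_t r, std::size_t c) { return v[r * C + c]; }
    double operator()(std::size_t r, std::size_t c) const { return v[r * C + c]; }
};

// Small-N path: the inner dimension K must match at compile time, and the
// loop bounds are constants the optimizer can see.
template <std::size_t R, std::size_t K, std::size_t C>
Mat<R, C> mul(const Mat<R, K>& a, const Mat<K, C>& b) {
    Mat<R, C> out;
    for (std::size_t i = 0; i < R; ++i)
        for (std::size_t k = 0; k < K; ++k)
            for (std::size_t j = 0; j < C; ++j)
                out(i, j) += a(i, k) * b(k, j);
    return out;
}

// Large-N path: dimensions are runtime values and operands are passed by
// pointer, so no matrix data is copied. `out` must not alias `a` or `b`.
void mul_runtime(const double* a, const double* b, double* out,
                 std::size_t rows, std::size_t inner, std::size_t cols) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < inner; ++k) {
                acc += a[i * inner + k] * b[k * cols + j];
            }
            out[i * cols + j] = acc;
        }
    }
}
```

A chain like `mul(mul(a, b), c)` is evaluated in exactly the order written, and `mul(a, b)` and `mul(b, a)` generally give different results, which is the non-commutativity point above.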

Eigen is a great library and very convenient to use. It's great to be able to write straightforward chains of matrix operations and trust that the resulting program will be reasonably fast. There's no need to be dogmatic about C vs. C++, though. They're both higher-level languages targeting the same underlying hardware. Templates enable library developers to make simple APIs at the expense of more complicated library implementations. In C, it's often necessary to compromise on the simplicity of the API to achieve the same performance, but it also generally means that the library implementations are simpler. The overall quality of the resulting binary can be about the same, and is almost certainly in the same ballpark performance-wise. As an embedded engineer, I often need APIs that are compatible with C whether or not the implementation is C++, and I value simple library implementations over complex ones; the libraries and my use cases are often obscure enough that the libraries are buggy, so the more readable a library is, the easier it is for me to debug.
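As a rough illustration of a C-compatible API over a C++ implementation, here is a sketch under assumed names (`detail::mul_square`, `mat3_mul`, and `mat4_mul` are hypothetical, not from any real library). The template does the work once, and each size gets a plain `extern "C"` entry point that C code can call. The API is clunkier than operator overloading, but the implementation stays short enough to read in one sitting.

```cpp
#include <cassert>
#include <cstddef>

namespace detail {
// Hypothetical helper: N x N row-major multiply with N fixed at compile time.
template <std::size_t N>
void mul_square(const double* a, const double* b, double* out) {
    // Precondition: the output buffer must not alias either input.
    assert(out != a && out != b);
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < N; ++k) {
                acc += a[i * N + k] * b[k * N + j];
            }
            out[i * N + j] = acc;
        }
    }
}
}  // namespace detail

// C-callable wrappers: no templates or overloads appear in the signatures,
// so a C header can declare these as ordinary functions.
extern "C" {
void mat3_mul(const double* a, const double* b, double* out) {
    detail::mul_square<3>(a, b, out);
}
void mat4_mul(const double* a, const double* b, double* out) {
    detail::mul_square<4>(a, b, out);
}
}
```

From C, the caller would declare `void mat3_mul(const double *a, const double *b, double *out);` and pass plain arrays of 9 doubles.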

As a recent real-world example, a coworker, who is a wizard and knows way more than I do about signal processing, implemented some matrix-heavy algorithms in a high-level language that supports just-in-time compilation down to parallelized CPU and even GPU machine code. It worked great on an x86-64 workstation, but on production hardware, we struggled to get the code to run fast enough; it would peg all the CPUs at 100%. The many layers of libraries and JIT compilation made the system very hard to debug, even after a couple of weeks of trying. I suggested re-implementing the algorithm in C++ using whatever matrix library was most convenient, and a few days later the system was running perfectly and averaging 14% of one CPU. The algorithm went from maybe 50 lines of very readable code to 250 lines of relatively ugly code, but we understood what it was doing way better. I believe he used Eigen in the C++ implementation, but whether or not the matrix library was optimized at all, and whether it was written in C, C++, or Rust, it still would have sipped around 14% of one CPU. My point is that, when performance matters, you need to understand what the software and hardware are doing, and so there's value in simplicity and pragmatism.