Hacker News new | ask | show | jobs
by pacaro 3268 days ago
I've encountered academic code that implements the state of art, most efficient, algorithm for solving some problem.

The code that comes to mind had the following properties: over 20 years old; written in C and badly converted to C++ somewhere along the way (the stuff-all-the-globals-into-a-class approach); a combinatorial explosion of #define and #ifdef statements (covering all the experiments in the original paper)

In the paper, it is clear that one of the experiments wins, and why. So...

Step 1: remove all dead code.

Step 2: observe that the algorithm needs no dynamic memory allocation, remove all but 1 call to malloc, calloc, realloc, and free.

Step 3: the use of float can be replaced by correctly scaled 64-bit unsigned integers, with no loss of precision

Step 4: rewrite entirely in modern C++, this has two benefits, a) I get to use the <algorithms> library (judiciously, this simplifies the code enormously), and b) the code can send clearer messages to the compiler than the mid-90s liberal sprinkling of the 'register' keyword.

The net result is no asymptotic improvement whatsoever — arguably a slight improvement for very large N as heap performance starts to interfere, but nothing worth the effort.

However, the code now has tests (step 0), is clean and maintainable, is 10% of the size, and is 5-30x faster (depending on the shape of the data)