Hacker News
by Const-me 2147 days ago
> Eigen is a super complicated, template-heavy library

I agree, but that design is exactly what lets you apply optimizations by specializing those templates.
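To make the mechanism concrete, here is a minimal sketch of overriding a library's generic primitive by specializing its template. Everything in it is hypothetical: the `mini` namespace, `Packet2d`/`Packet4d` (plain `std::array` stand-ins for SIMD registers), `load_broadcast`, and the `g_loads` counter are invented for illustration and are not Eigen's actual internals.

```cpp
#include <array>
#include <cassert>

namespace mini {  // hypothetical mini-library, not Eigen's real code

using Packet2d = std::array<double, 2>;  // scalar stand-ins for SIMD registers
using Packet4d = std::array<double, 4>;

static int g_loads = 0;  // counts simulated memory loads, for demonstration only

// Library primitive: read one scalar and copy it into every lane.
template <typename Packet>
Packet load_broadcast(const double* p) {
    ++g_loads;
    Packet r;
    r.fill(*p);
    return r;
}

// Primary template: generic fallback, one load per output packet.
template <typename Packet>
void broadcast4(const double* a, Packet& a0, Packet& a1, Packet& a2, Packet& a3) {
    assert(a != nullptr);
    a0 = load_broadcast<Packet>(a + 0);
    a1 = load_broadcast<Packet>(a + 1);
    a2 = load_broadcast<Packet>(a + 2);
    a3 = load_broadcast<Packet>(a + 3);
}

}  // namespace mini

// "User" code: a full specialization replaces the fallback for one packet type.
// It must be declared before the first call that would instantiate the primary.
namespace mini {

template <>
inline void broadcast4<Packet4d>(const double* a, Packet4d& a0, Packet4d& a1,
                                 Packet4d& a2, Packet4d& a3) {
    assert(a != nullptr);
    ++g_loads;  // a single 4-wide load
    const Packet4d v = {{a[0], a[1], a[2], a[3]}};
    a0.fill(v[0]);  // the fills stand in for in-register shuffles
    a1.fill(v[1]);
    a2.fill(v[2]);
    a3.fill(v[3]);
}

}  // namespace mini
```

Calling `mini::broadcast4` with `Packet4d` arguments picks the specialization (the counter goes up by 1), while `Packet2d` still goes through the generic path (the counter goes up by 4). Call sites do not change, which is the point.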

This works at both scales: micro-optimizations (their pbroadcast4<__m256d> does 4 loads, whereas on many CPUs AVX2 can do better with a single load + shuffles) and replacing large parts of Eigen (I was able to improve the performance of the conjugate gradient solver by moving the sparse matrix into a SIMD-optimized structure).
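For the micro-optimization, a rough sketch of the two strategies side by side. This is not Eigen's code and not necessarily the exact replacement I used: `broadcast4_four_loads`, `broadcast4_one_load` and `cpu_has_avx2` are hypothetical names, and results are stored to a plain `double[16]` so they are easy to compare. It assumes GCC or Clang on x86-64; the `target("avx2")` attribute and `__builtin_cpu_supports` are specific to those compilers.

```cpp
#include <immintrin.h>
#include <cassert>

// Assumption: GCC/Clang on x86-64. The target attribute lets these functions
// use AVX2 intrinsics even when the file is not compiled with -mavx2.

// Baseline: four separate broadcast loads, one per output register.
__attribute__((target("avx2")))
inline void broadcast4_four_loads(const double* a, double* out) {
    assert(a != nullptr && out != nullptr);
    _mm256_storeu_pd(out + 0,  _mm256_broadcast_sd(a + 0));
    _mm256_storeu_pd(out + 4,  _mm256_broadcast_sd(a + 1));
    _mm256_storeu_pd(out + 8,  _mm256_broadcast_sd(a + 2));
    _mm256_storeu_pd(out + 12, _mm256_broadcast_sd(a + 3));
}

// Alternative: one 256-bit load, then four in-register permutes.
// Each immediate selects the same source lane for all four outputs:
// 0x00 -> lane 0, 0x55 -> lane 1, 0xAA -> lane 2, 0xFF -> lane 3.
__attribute__((target("avx2")))
inline void broadcast4_one_load(const double* a, double* out) {
    assert(a != nullptr && out != nullptr);
    const __m256d v = _mm256_loadu_pd(a);  // lanes: a[0], a[1], a[2], a[3]
    _mm256_storeu_pd(out + 0,  _mm256_permute4x64_pd(v, 0x00));
    _mm256_storeu_pd(out + 4,  _mm256_permute4x64_pd(v, 0x55));
    _mm256_storeu_pd(out + 8,  _mm256_permute4x64_pd(v, 0xAA));
    _mm256_storeu_pd(out + 12, _mm256_permute4x64_pd(v, 0xFF));
}

// Runtime guard: only call the functions above when the CPU supports AVX2.
inline bool cpu_has_avx2() {
    return __builtin_cpu_supports("avx2") != 0;
}
```

Both functions read the same 32 bytes and produce the same 16 doubles. Whether the second is actually faster depends on the microarchitecture and on how busy the load and shuffle ports already are in the surrounding kernel, so measure before swapping it in.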