by Const-me
815 days ago
> show us your results

Not GP, but here's an example where intrinsics outperformed assembly by an order of magnitude: https://news.ycombinator.com/item?id=36624240

That comparison was AVX2 SIMD intrinsics versus scalar assembly, but I doubt AVX2 assembly would substantially improve on the performance of my C++. The compiler did a decent job allocating the vector registers, and the generated assembly is not too bad; there's not much to improve.

It's interesting how close your 800% is to my 1000%. For this reason, I suspect you tested the opposite: naïve C or C++ versus SIMD assembly. Or maybe you tested automatically vectorized C or C++ code; automatic vectorizers often fail to deliver anything good.

I think you're completely missing what we are talking about here.