| > writing video codecs for years These people aren’t the only ones writing performance-critical SIMD code. I’ve been doing that for more than a decade now, and even wrote articles on the subject, like http://const.me/articles/simd/simd.pdf > that you use daily The video codecs I use daily are mostly implemented in hardware, not software. > it’s a lot faster (10-20%) Finally, believable numbers. Please note that earlier in this very thread you claimed an “800% increase”, which was totally incorrect. BTW, it’s often possible to rework the source code and/or adjust compiler options to improve the performance of the machine code generated from SIMD intrinsics, shrinking that 10-20% gap to something like 1-2%. Optimizations like that are obviously less reliable than using assembly, and also relatively tricky to implement, because compilers don’t expose enough knobs for low-level things like register allocation. However, the approach still takes much less time than writing assembly, and it’s often good enough for many practical applications. Examples of these applications include Windows software shipped as binaries, and HPC or embedded, where you can rely not just on a specific compiler version, but even on a specific processor revision and OS build. |
You cherry-pick my comments and cannot be bothered to read them.
We’re talking about fully optimized autovec (all LLVM options enabled) vs hand-written asm. And yes, 800% is likely.
The 20% is intrinsics vs hand-written asm.
> The video codecs I use daily are mostly implemented in hardware, not software.
Weirdly, I know a bit more about the transcoding pipelines of the video industry than you do. And it’s far from hardware decoding and encoding over there…
You know nothing about the subject you are talking about.