by apocalypses 2055 days ago
One thing LLVM bitcode still can’t do is retain information about preprocessor directives, eg any platform specific code for AVX2 vs SSE4 etc. So you are limited to writing intrinsic-free code, and relying on the compiler’s automatic vectorisation is usually less performant and less reliable, which results in worse codegen overall.
2 comments

> One thing LLVM bitcode still can’t do is retain information about preprocessor directives, eg any platform specific code for AVX2 vs SSE4 etc.

LLVM supports per-function subtarget attributes, so you can compile individual functions with AVX2 support versus SSE4 support. The clang frontend even has a few different ways of triggering this support, with one method allowing you to specify per-CPU dispatch, and another merely specifying target attributes on a per-function basis.
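
For illustration, here is a rough sketch of the per-function target attribute approach with manual runtime dispatch. It assumes GCC or clang on x86-64; the function names (sum_avx2, sum_sse41, sum_scalar, sum_dispatch) are made up for this example, and on other architectures it simply falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <immintrin.h>

/* Only this function is compiled with AVX2 enabled. */
__attribute__((target("avx2")))
static int32_t sum_avx2(const int32_t *v, size_t n) {
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_epi32(acc, _mm256_loadu_si256((const __m256i *)(v + i)));
    int32_t lanes[8];
    _mm256_storeu_si256((__m256i *)lanes, acc);
    int32_t total = 0;
    for (int k = 0; k < 8; k++) total += lanes[k];
    for (; i < n; i++) total += v[i];   /* scalar tail */
    return total;
}

/* Only this function is compiled with SSE4.1 enabled. */
__attribute__((target("sse4.1")))
static int32_t sum_sse41(const int32_t *v, size_t n) {
    __m128i acc = _mm_setzero_si128();
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_epi32(acc, _mm_loadu_si128((const __m128i *)(v + i)));
    int32_t total = _mm_extract_epi32(acc, 0) + _mm_extract_epi32(acc, 1)
                  + _mm_extract_epi32(acc, 2) + _mm_extract_epi32(acc, 3);
    for (; i < n; i++) total += v[i];   /* scalar tail */
    return total;
}
#endif

/* Baseline version, built with whatever the translation unit targets. */
static int32_t sum_scalar(const int32_t *v, size_t n) {
    int32_t total = 0;
    for (size_t i = 0; i < n; i++) total += v[i];
    return total;
}

/* Pick an implementation at run time based on what the CPU reports. */
int32_t sum_dispatch(const int32_t *v, size_t n) {
#if defined(__x86_64__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))   return sum_avx2(v, n);
    if (__builtin_cpu_supports("sse4.1")) return sum_sse41(v, n);
#endif
    return sum_scalar(v, n);
}
```

The point is that the intrinsics live in ordinary functions selected by an attribute, not by an #ifdef, so no preprocessor information has to survive into the IR.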

I know GCC supports generating multiple versions of a function, compiled for different instruction set extensions. And this can also be done manually when you have hand-optimized implementations: https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Common-Functio... It's based on a GNU extension to the ELF format, not preprocessor directives.
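
As a minimal sketch of the automatic variant, assuming GCC (or a recent clang) on x86-64 Linux with glibc, the target_clones attribute asks the compiler to emit several versions of one function plus a resolver. The MULTIVERSION macro and the saxpy_i32 function are hypothetical names for this example; on other platforms the macro expands to nothing and you get a single plain build.

```c
#include <stddef.h>
#include <stdint.h>

/* target_clones relies on ifunc support, so only enable it where
   that is expected to be available (assumption: x86-64 + glibc). */
#if defined(__x86_64__) && defined(__GLIBC__)
#define MULTIVERSION __attribute__((target_clones("avx2", "sse4.2", "default")))
#else
#define MULTIVERSION
#endif

/* Plain C loop; the compiler builds one clone per listed target and
   the matching clone is selected when the program is loaded. */
MULTIVERSION
void saxpy_i32(int32_t *restrict y, const int32_t *restrict x,
               int32_t a, size_t n) {
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Callers just call saxpy_i32 as usual; the selection is invisible at the call site.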

I don't think any of that would conflict with using bitcode for everything you don't have a hand-optimized machine-specific implementation of.