HN Mirror



	by gnufx 1788 days ago
	In LLVM or in libomp? I don't know what omp simd is likely to get you over autovectorization. I know of cases where it was thought necessary (-fopenmp-simd, without -fopenmp) but wasn't with recent GCC.
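	For context, one case where the directive can make a difference is a floating-point reduction. Without -ffast-math (which -Ofast implies) the compiler has to preserve the order of the additions, so it will generally decline to vectorize the loop; the reduction clause gives explicit permission to reorder. The sketch below is only an illustration, and the function name sum_squares is hypothetical. It should build with GCC using -O2 -fopenmp-simd, with no need for -fopenmp or the OpenMP runtime.

```c
#include <stddef.h>

/* Hypothetical example: sum of squares of n floats.
 * The pragma states that iterations may run in SIMD lanes and that
 * partial sums may be combined in any order. Without OpenMP support
 * enabled, the pragma is ignored and the loop stays correct scalar code. */
float sum_squares(const float *x, size_t n)
{
    float sum = 0.0f;
#pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++)
        sum += x[i] * x[i];
    return sum;
}
```

	Comparing the -fopt-info-vec output with and without -fopenmp-simd is a quick way to see whether the directive changed anything for a given compiler version.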

1 comments

dragontamer 1788 days ago

Autovectorization has issues with function calls.

"#pragma omp declare simd" applies to a function declaration and tells the compiler to generate SIMD versions of that function, which can then be called from inside a "#pragma omp for simd" loop.

A few keywords here and there really help the autovectorizer get closer to a CUDA-like environment (like... actually having your SIMD code extend "through" a function call, so you can start splitting up the work a bit better).
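A minimal sketch of the pattern described above, under some assumptions: the names scale_and_clamp and apply_gain are made up for illustration, and the loop uses plain "#pragma omp simd" rather than "for simd" so it stays single-threaded. In a single file like this the compiler would probably just inline the call; the directive matters most when the function body lives in another translation unit and only the annotated declaration is visible at the call site.

```c
#include <stddef.h>

/* Hypothetical helper. "declare simd" asks the compiler to also emit
 * vector variants of this function. uniform(gain) says gain is the same
 * for every lane; notinbranch says it is never called conditionally
 * inside the SIMD loop, so no masked variant is needed. */
#pragma omp declare simd uniform(gain) notinbranch
float scale_and_clamp(float v, float gain)
{
    float y = v * gain;
    return y > 1.0f ? 1.0f : y;
}

/* The SIMD loop can now "extend through" the call: each group of
 * iterations may use a vector variant of scale_and_clamp. */
void apply_gain(float *restrict out, const float *restrict in,
                float gain, size_t n)
{
#pragma omp simd
    for (size_t i = 0; i < n; i++)
        out[i] = scale_and_clamp(in[i], gain);
}
```

Whether the vector variant is actually used depends on the compiler, target, and flags, so checking the vectorization report or the generated code is still worthwhile.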

EDIT: Here's an example from Intel's ICC: https://software.intel.com/content/www/us/en/develop/documen...

gnufx 1787 days ago

I took the example program from the OpenMP standard and built it with GCC 11 -Ofast. -fopt-info said the relevant loop was vectorized. Adding -fopenmp gave more vectorization messages from elsewhere, but I don't have time to figure out the difference from the tree dump (not being good with assembler). Doubtless the directives can help, but you do need to get them right, and I trust GCC more than me!