HN Mirror

	by capyba 224 days ago
	Interesting, how so? I’ve had really good success with autovectorization in gcc and the Intel C compiler. It’s often faster than my own intrinsics, though not always. One notable exception is reduction-style updates: when I’m updating large arrays, i.e. `A[i] += a`, the compiler struggles to use SIMD and I need to do it myself.
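For reference, here is a rough sketch of the two loop shapes in play, assuming `A` is a contiguous `float` array. The function names `add_scalar` and `sum_array` are made up for illustration. Whether either loop actually gets vectorized depends on the compiler, flags, and target; for example, gcc generally won't reorder a floating-point sum unless something like `-ffast-math` or `-fassociative-math` is given, because reordering changes rounding.

```c
#include <stddef.h>

/* Elementwise update, A[i] += a. Every iteration is independent, and
 * `restrict` tells the compiler that A does not alias anything else,
 * which is usually what an autovectorizer needs to know. */
void add_scalar(float *restrict A, size_t n, float a)
{
    for (size_t i = 0; i < n; i++) {
        A[i] += a;
    }
}

/* A true reduction: every iteration feeds the same accumulator, so
 * vectorizing it means reassociating the floating-point additions. */
float sum_array(const float *A, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += A[i];
    }
    return acc;
}
```

With gcc, `-O3 -fopt-info-vec` (or `-fopt-info-vec-missed`) reports which loops were vectorized, which is a quick way to check before reaching for intrinsics.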

1 comment

There's no optimal portable `movemask` operation, because aarch64 NEON doesn't have an equivalent instruction.
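To make the gap concrete, here is a minimal sketch of what a "portable" byte movemask ends up looking like. The helper name `movemask_u8` is hypothetical. On x86 with SSE2 it maps to the single `_mm_movemask_epi8` intrinsic; everywhere else this sketch falls back to a plain scalar loop, which is correct but nowhere near one instruction. A real aarch64 build would swap in a NEON-specific sequence instead, which is not shown here.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Collect the top bit of each of 16 bytes into a 16-bit mask,
 * with bit i of the result taken from bytes[i]. */
unsigned movemask_u8(const uint8_t bytes[16])
{
#if defined(__SSE2__)
    /* x86: one unaligned load plus one movemask instruction. */
    __m128i v = _mm_loadu_si128((const __m128i *)bytes);
    return (unsigned)_mm_movemask_epi8(v) & 0xFFFFu;
#else
    /* Fallback (assumed here for any non-SSE2 target, aarch64 included):
     * extract each high bit one byte at a time. */
    unsigned mask = 0;
    for (int i = 0; i < 16; i++) {
        mask |= (unsigned)(bytes[i] >> 7) << i;
    }
    return mask;
#endif
}
```

For example, an input with only `bytes[0]` and `bytes[15]` having their high bit set yields `0x8001` on either path.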