HN Mirror


by touisteur 1559 days ago

AVX512 was clearly a great innovation in the vectorization landscape. A far cleaner instruction set, complete and symmetric, with very interesting blend, ternlog, lane-crossing instructions and the especially interesting mask registers. Lots and lots of goodies and an eye for compiler implementation.
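For readers who haven't met ternlog: it computes any three-input bitwise function, with an 8-bit immediate acting as the truth table. Below is a rough scalar model in plain C, not the intrinsic itself. The names `ternlog32`, `ternlog_demo` and the `TL_*` constants are made up for illustration; the bit ordering (first operand supplies the high index bit) is how I understand the instruction to work.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a three-input bitwise ternary-logic operation.
   For each bit position, the bits of a, b and c form a 3-bit index
   (a is the high bit, c the low bit) into the 8-bit truth table imm8.
   Hypothetical helper, not a real intrinsic. */
uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8)
{
    uint32_t result = 0;
    for (int bit = 0; bit < 32; ++bit) {
        uint32_t idx = (((a >> bit) & 1u) << 2)
                     | (((b >> bit) & 1u) << 1)
                     |  ((c >> bit) & 1u);
        result |= (uint32_t)((imm8 >> idx) & 1u) << bit;
    }
    return result;
}

/* A few commonly used truth tables. */
enum {
    TL_XOR3     = 0x96, /* a ^ b ^ c */
    TL_MAJORITY = 0xE8, /* (a & b) | (a & c) | (b & c) */
    TL_SELECT   = 0xCA  /* bitwise a ? b : c */
};

/* Small self-check comparing the model against the plain C expressions. */
void ternlog_demo(void)
{
    uint32_t a = 0x12345678u, b = 0x9ABCDEF0u, c = 0x0F0F0F0Fu;
    assert(ternlog32(a, b, c, TL_XOR3) == (a ^ b ^ c));
    assert(ternlog32(a, b, c, TL_MAJORITY) == ((a & b) | (a & c) | (b & c)));
    assert(ternlog32(a, b, c, TL_SELECT) == ((a & b) | (~a & c)));
}
```

The point is that one instruction replaces what would otherwise be two or three dependent logic ops, and the compiler only has to pick the right immediate.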

I feel Intel failed hard at spreading the ISA (why not put it everywhere, even at half perf; it would improve later with no code changes) and also by not putting more energy/dollars into ispc. Yeah yeah, your compiler engineers are clever, but you've been doing this for 20 years and autovectorization is still a ways off. Let me write code in a way that can be easily vectorized. A subset of C. Less awkward than CUDA.

Now it seems AVX512 and large vector units are dying and are still too niche. Sad.

1 comments

floatboth 1559 days ago

The cleanup being tied to the width increase was the first problem. The new width still being a fixed one was the second.

SVE is SIMD actually done right – on the Arm side in the near future, everything from smartphones to massive HPC boxes will be covered by the same clean SIMD ISA.
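To make the "not a fixed width" point concrete, here is a minimal sketch of a vector-length-agnostic loop, modelled in scalar C rather than with real SVE intrinsics. The `vl` parameter is an assumption standing in for the hardware vector length that the code only learns at run time, and `saxpy_vla` is a made-up name. The inner loop plays the role of one predicated vector instruction, with lanes past the end of the array switched off, roughly what a `whilelt`-style predicate does.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] += alpha * x[i], processed in chunks of `vl` lanes.
   `vl` models the hardware vector length; the same code is meant to
   give the same results whatever value the hardware reports. */
void saxpy_vla(float alpha, const float *x, float *y, size_t n, size_t vl)
{
    assert(vl > 0);
    for (size_t base = 0; base < n; base += vl) {
        /* One modelled vector operation: a lane is active only while
           its element index is still below n. */
        for (size_t lane = 0; lane < vl; ++lane) {
            int active = (base + lane) < n;
            if (active)
                y[base + lane] += alpha * x[base + lane];
        }
    }
}
```

Calling it with `vl` set to 4, 16 or 64 on the same input should produce identical output, which is the whole appeal: no recompile and no separate remainder loop when the vector width changes.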

touisteur 1557 days ago

I agree it would have been nice to have 'infinite sized' instructions, chopped up to the actual underlying vector size. But there were so many complaints about AMD implementing some 256-bit-wide instructions as 2x128 that I feel they went for the least-microcode route.

Mask registers offset the size problem a bit. I just wish we'd rebuild a language or clean libraries to take full advantage of this programming model. Is ispc still maintained? Does anyone use it in prod? Genuinely curious.
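A rough sketch of what I mean by masks offsetting the size problem, again as a scalar model in C and not real AVX512 intrinsics. All names here (`LANES`, `tail_mask`, `cmp_gt_zero`, `masked_add`, `add_positive`) are hypothetical. A 16-bit integer stands in for a mask register over 16 32-bit lanes; the same mask handles both the ragged tail of the array and a per-lane condition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16  /* 16 x 32-bit lanes, as in one 512-bit register */

/* Mask with the low `count` bits set; count is clamped to LANES. */
uint16_t tail_mask(size_t count)
{
    return count >= LANES ? (uint16_t)0xFFFFu
                          : (uint16_t)((1u << count) - 1u);
}

/* Lane-wise compare: bit i is set where x[i] > 0, restricted to the
   lanes enabled in `active`. Disabled lanes are never read. */
uint16_t cmp_gt_zero(const int32_t *x, uint16_t active)
{
    uint16_t k = 0;
    for (int lane = 0; lane < LANES; ++lane)
        if (((active >> lane) & 1u) && x[lane] > 0)
            k |= (uint16_t)(1u << lane);
    return k;
}

/* Merge-masked add: dst[i] += src[i] where bit i of k is set;
   all other lanes of dst are left untouched. */
void masked_add(int32_t *dst, const int32_t *src, uint16_t k)
{
    for (int lane = 0; lane < LANES; ++lane)
        if ((k >> lane) & 1u)
            dst[lane] += src[lane];
}

/* acc[i] += x[i] for every positive x[i], for any n,
   without a separate scalar remainder loop. */
void add_positive(int32_t *acc, const int32_t *x, size_t n)
{
    for (size_t base = 0; base < n; base += LANES) {
        uint16_t live = tail_mask(n - base);      /* handles the tail */
        uint16_t k = cmp_gt_zero(x + base, live); /* handles the branch */
        masked_add(acc + base, x + base, k);
    }
}
```

The width is still baked in through `LANES`, so this is not length-agnostic, but the last partial chunk and the `if` inside the loop both disappear into mask arithmetic.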

I feel SVE is 'too late', as most CPU makers seem to be going back to smaller vector units (leaving the vectorized stuff to GPUs - I know they're not the same thing, but if you're investing in heavy perf hardware, for repetitive computing...) and even Intel doesn't seem very serious about AVX512 except in the Xeon world. But then if you pay 8000EUR for a Platinum thing, you might be able to pay for top talent to handcraft some intrinsics.