Hacker News (HN Mirror)


	by ajayjain 2716 days ago
	In some recent work from my group [1], we reduce the complexity of keeping up with new SIMD ISAs by retargeting code between generations. For example, a compiler pass can take code written with intrinsics to target SSE2 and emit AVX-512 - in effect, it auto-vectorizes hand-vectorized code. With a more capable compiler, as the ISA grows in complexity, programmers and library users get speedups without rewriting their code or relying on scalar auto-vectorization. However, the x86 ISA growth certainly pushed some complexity onto us as compiler writers - we had to write a pass to retarget instructions!

	[1] https://www.nextgenvec.org/#revec
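To make the input to such a pass concrete, here is a minimal sketch of the kind of hand-vectorized SSE2 code being discussed. It is an illustration only: `add_i32` is a hypothetical helper, not code from the paper, and the block does not show the retargeting pass itself. The loop processes four 32-bit lanes per 128-bit register; a retargeting pass of the kind described would, conceptually, widen this to sixteen lanes per 512-bit register when targeting AVX-512. The `__SSE2__` guard is an assumption added so the sketch still builds on non-x86 targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] with wrapping 32-bit adds. */
void add_i32(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    size_t i = 0;
#if defined(__SSE2__)
    /* Hand-vectorized body: four 32-bit lanes per 128-bit register. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_add_epi32(va, vb));
    }
#endif
    /* Scalar tail; also the full fallback on targets without SSE2. */
    for (; i < n; ++i)
        dst[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
}
```

The scalar tail uses unsigned arithmetic so that it wraps the same way `_mm_add_epi32` does.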

2 comments

jabl 2716 days ago

Recently a patch was contributed to GCC that converts MMX intrinsics to SSE. Also, the GCC POWER target supports x86 vector intrinsics, converting them to their POWER equivalents.

It's not as ambitious as your approach, though: it's more like a 1:1 translation and thus cannot take advantage of wider vectors.


glangdale 2716 days ago

That patch is primarily there to avoid the pitfalls of MMX on modern architectures; MMX is gradually being deprecated. On SKX, operations that are available on both ports 0 and 1 for SSE or AVX are only available on port 0 for MMX. So code that uses MMX is getting half the throughput (which may or may not matter, but still).


jabl 2715 days ago

Thanks for the explanation, I wasn't aware of the reasoning behind it. I would guess by now all actively maintained performance-critical code has been rewritten in something more modern, so it certainly makes sense for Intel to minimize the number of gates they dedicate to MMX.


wmu 2716 days ago

Sorry for a non-constructive comment, just wanted to say your paper is great. :)


ajayjain 2712 days ago

Thank you! :)
