by gpapilion 684 days ago
Mishandling aside, the issue I've seen is that there really isn't consumer demand for this. Before AMD had AVX512, most of the comments were about wasting silicon on SIMD rather than improving other aspects of the CPU. I'm pretty sure there was good reason to think it was largely a dark area of the chip.

From what I've seen, though I haven't heard it discussed much, going from a naive implementation to AVX512 is a huge gain, but going from AVX2 to AVX512 was not very impressive for the application I was looking at. The complexity this code added, plus the cases where we needed it to run on AMD (for other reasons), basically made taking advantage of the feature undesirable for a single-digit gain.
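
For anyone wondering where that complexity comes from: running the same binary on CPUs with and without AVX512 usually means runtime dispatch. Here is a minimal sketch, assuming GCC or Clang on x86. The kernel (`add_i32`) and the helper names are hypothetical, not from the application above. The loop body is identical in all three copies; only the `target` attribute differs, so the compiler may vectorize each copy for a wider register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*add_fn)(int32_t *dst, const int32_t *a, const int32_t *b, size_t n);

/* Baseline copy: built with whatever the default target is. */
static void add_baseline(int32_t *dst, const int32_t *a, const int32_t *b, size_t n) {
    for (size_t i = 0; i < n; i++) dst[i] = a[i] + b[i];
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Same loop, but the compiler is allowed to emit AVX2 code for this copy. */
__attribute__((target("avx2")))
static void add_avx2(int32_t *dst, const int32_t *a, const int32_t *b, size_t n) {
    for (size_t i = 0; i < n; i++) dst[i] = a[i] + b[i];
}

/* Same loop again, with AVX-512 Foundation allowed. */
__attribute__((target("avx512f")))
static void add_avx512(int32_t *dst, const int32_t *a, const int32_t *b, size_t n) {
    for (size_t i = 0; i < n; i++) dst[i] = a[i] + b[i];
}
#endif

/* Pick the widest variant the running CPU reports support for. */
static add_fn pick_add(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) return add_avx512;
    if (__builtin_cpu_supports("avx2")) return add_avx2;
#endif
    return add_baseline;
}

/* Public entry point: resolves the variant on first call, then reuses it. */
void add_i32(int32_t *dst, const int32_t *a, const int32_t *b, size_t n) {
    static add_fn impl = NULL;
    if (impl == NULL) impl = pick_add();
    assert(impl != NULL);
    impl(dst, a, b, n);
}
```

Callers only ever see `add_i32`. The cost is that every hot kernel now exists in three copies that all need testing, which is the kind of overhead that is hard to justify for a small gain.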

Things like VNNI or AMX are better wins, but they are only needed in very specific cases. VNNI in particular looked to be a 30% improvement in a BERT workload.
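
For context on why VNNI helps int8 inference in particular: its core instruction (VPDPBUSD) multiplies four unsigned 8-bit values by four signed 8-bit values and adds the four products into a 32-bit accumulator, per 32-bit lane, in one step. A scalar sketch of what a single lane computes is below. The function names are hypothetical, and this is a reference model of the arithmetic, not the code from the BERT workload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 32-bit lane: acc += sum of four (unsigned 8-bit * signed 8-bit) products.
   Models the non-saturating form of the instruction. */
int32_t dot4_u8s8_ref(int32_t acc, const uint8_t a[4], const int8_t b[4]) {
    for (int i = 0; i < 4; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}

/* Dot product over n elements, processed in groups of four like the hardware does.
   n must be a multiple of 4 in this sketch. */
int32_t dot_u8s8_ref(const uint8_t *a, const int8_t *b, size_t n) {
    assert(n % 4 == 0);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i += 4) {
        acc = dot4_u8s8_ref(acc, a + i, b + i);
    }
    return acc;
}
```

Without VNNI, the same result takes a short sequence of widening multiply and add instructions, so quantized matrix multiplies are where the win shows up.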

1 comments

Isn't it a bit weird to expect consumer demand for CPU instruction set extensions?

Obviously there's very little of that, but what should matter is developer uptake and the better end-user experience it can deliver. (I'd also hope for even better autovectorization in compilers.)

In my opinion it's kind of insane that we're still building so much software for ancient baselines and leaving quite a bit of performance on the table across the entire system. (How much has Apple won in terms of performance by forcing everyone to build for new ARM targets using new toolchains?)

Consumers want faster processing; the instructions are just the method to get there. And they aren't necessarily the best method, since the die area dedicated to them could be used for something else.

It is insane, especially if you think emulation is performant enough to allow for a switch.