by jandrewrogers
2749 days ago
There are a couple more reasons, all related to loss of optimization that can offset any nominal price-performance gains. Counterintuitively, people who are the most sensitive to performance often have the most to lose. AMD implements some important scalar instruction set extensions as microcode, not in silicon, so if you have an application that uses them heavily (and some of these instructions are significant optimizations over generic C code) you will see a drop-off in performance. Highly optimized/efficient code for Intel microarchitectures become a lot less so on the significantly different AMD microarchitecture. The effects are not small and re-optimizing for a different microarchitecture can be a lot of work depending on the application.
Do you have any examples other than pdep and pext? Although these happen to be my two favorite scalar instructions, I would hesitate to call them important. Compilers won't just generate these from normal source [1], and I would call their use extremely niche at the moment (things like chess engines, I'm looking at you). They aren't even available on Intel Ivy Bridge and Sandy Bridge machines, which still make up a sizable fraction of data center machines.
So I'm pretty sure the number of entities avoiding switching to AMD because of heavy pdep and pext use is pretty close to zero.
Maybe you have some other instructions in mind though?
> Highly optimized/efficient code for Intel microarchitectures become a lot less so on the significantly different AMD microarchitecture.
This was somewhat true in the past, and probably hit its peak in the P4 vs Athlon/Opteron era. However, it is pretty much incorrect for Zen. Although the details of the hardware implementation might differ (and unless you are an insider you can mostly only guess at this), as an optimization target for software, Zen is very similar. It has a similar width, similar cache design for both data and instructions, similar instruction latencies and throughput, and so on. In fact, Zen is about as similar to Haswell as Haswell is to, say, Ivy Bridge.
The primary exception is AVX/AVX2 code, where Zen implements 256-bit operations internally as 128-bit operations. In this area you might make some different decisions if targeting Zen, but the gap is not huge.
---
[1] What I mean is they won't generate them in any scenario other than directly calling the x86-specific builtin/intrinsic for that exact instruction.
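
For anyone who hasn't run into these two instructions, here is a rough sketch of what they compute, written as generic C loops. The names `pext_soft` and `pdep_soft` are made up for illustration. In real code you would call the `_pext_u64` / `_pdep_u64` intrinsics from `<immintrin.h>` (built with BMI2 enabled, e.g. `-mbmi2` on GCC/Clang). This is only meant to show the semantics, not how any particular CPU implements them.

```c
#include <stdint.h>

/* Hypothetical portable stand-in for PEXT: collect the bits of `src`
   selected by `mask` and pack them into the low bits of the result,
   keeping their relative order. */
static uint64_t pext_soft(uint64_t src, uint64_t mask) {
    uint64_t result = 0;
    uint64_t out_bit = 1;                      /* next result bit to fill */
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set mask bit */
        if (src & lowest) {
            result |= out_bit;
        }
        out_bit <<= 1;
        mask &= mask - 1;                      /* clear lowest set mask bit */
    }
    return result;
}

/* Hypothetical portable stand-in for PDEP: take the low bits of `src`
   and scatter them to the positions where `mask` has set bits. */
static uint64_t pdep_soft(uint64_t src, uint64_t mask) {
    uint64_t result = 0;
    uint64_t in_bit = 1;                       /* next source bit to consume */
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set mask bit */
        if (src & in_bit) {
            result |= lowest;
        }
        in_bit <<= 1;
        mask &= mask - 1;                      /* clear lowest set mask bit */
    }
    return result;
}
```

With these definitions, `pext_soft(0xABCD, 0x0F0F)` gives `0xBD` and `pdep_soft(0xBD, 0x0F0F)` gives `0x0B0D`. Each loop runs once per set bit in the mask, which is the kind of generic code a single hardware pdep/pext replaces.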