by throwaway_pdp09 2172 days ago
The very wide AVX stuff with integer ops, like these from wiki:

- AVX-512 Byte and Word Instructions (BW) – extends AVX-512 to cover 8-bit and 16-bit integer operations[3]

- AVX-512 Integer Fused Multiply Add (IFMA) - fused multiply add of integers using 52-bit precision.

could be very useful. I could have done with those recently. They also don't (AFAIK) cause CPU frequency scaling (a polite term for downclocking). He may well be right about FP though.


52-bit precision? Typo?
52 bits is the size of the explicitly stored mantissa in an IEEE 754 double-precision floating-point number.
Sounds suspiciously like integers in the float mantissa.
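To make the 52-bit figure concrete, here is a rough Python sketch of what one 64-bit lane of the IFMA multiply-add does, as far as I understand Intel's description of VPMADD52LUQ and VPMADD52HUQ: the low 52 bits of two operands are multiplied into a 104-bit product, and either the low or the high 52 bits of that product are added to a 64-bit accumulator. The function names `madd52lo` and `madd52hi` are made up for illustration, and this models only the arithmetic, not the vector width or masking.

```python
import sys

MASK52 = (1 << 52) - 1  # low 52 bits, the width IFMA multiplies
MASK64 = (1 << 64) - 1  # each lane is a 64-bit accumulator


def madd52lo(a: int, b: int, c: int) -> int:
    """Hypothetical one-lane model of VPMADD52LUQ:
    a + low 52 bits of (b[51:0] * c[51:0]), wrapped to 64 bits."""
    product = (b & MASK52) * (c & MASK52)  # up to 104 bits wide
    return (a + (product & MASK52)) & MASK64


def madd52hi(a: int, b: int, c: int) -> int:
    """Hypothetical one-lane model of VPMADD52HUQ:
    a + high 52 bits of (b[51:0] * c[51:0]), wrapped to 64 bits."""
    product = (b & MASK52) * (c & MASK52)
    return (a + (product >> 52)) & MASK64


# The two halves together reconstruct the full 104-bit product.
b, c = (1 << 51) + 3, (1 << 50) + 7
full_product = madd52lo(0, b, c) + (madd52hi(0, b, c) << 52)

# A double has 53 significant bits (52 stored + 1 implicit),
# so 2**53 + 1 is the first integer it cannot represent exactly.
double_precision_bits = sys.float_info.mant_dig
```

The 52-bit operand width lining up with the stored mantissa of a double is what makes the "integers in the float mantissa" guess plausible, although nothing in this thread confirms how the hardware is actually built.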
If he were right about FP, he'd know better than the business analysts at Intel. Instead, his opinion is based on what the market looked like thirty years ago.

Nine years ago, AMD tested the hypothesis that more "cores" and higher integer throughput were really all that was needed and that FP performance didn't matter. The resulting architecture (Bulldozer) was a near-fatal disaster. It didn't even work out in the datacenter, where you might expect that hypothesis to hold.

AMD is currently giving Intel great pain. So much for business analysts at Intel.
If Intel had their shit together they would have released AVX-512 years ago with Skylake desktop, but they prefer to artificially segment the market, and have still not managed to release a desktop chip with AVX-512, allowing AMD to catch up and now in many ways surpass them.
The only pain I see is AMD having won the CPU for game consoles.

All our laptops have Intel stickers on them and I doubt AMD is winning crazy dollars on cloud deployments.

AMD is doing well in the CPU market today _because_ they reversed course from the Bulldozer-based architectures.
More than that, Bulldozer didn't even have good single-threaded integer performance. What it gave you was 8 cores that might be able to keep up with 4 of Intel's cores on something that has 8 threads. The market was not particularly interested in this, especially since at the time even fewer things could actually use 8 threads than can now.
Bulldozer significantly outperformed Sandy Bridge on the workloads which it was designed to be good at, which is multi-threaded integer workloads, like compiling the Linux kernel.

https://www.phoronix.com/scan.php?page=article&item=amd_fx81...

https://www.phoronix.com/scan.php?page=article&item=amd_fx83...

If Linus' attitude of "I'd rather have more cores" and "FP doesn't really matter" were representative of market demand, you'd have expected Bulldozer to do well at least somewhere, as opposed to nowhere.

Are we looking at the same benchmarks? In the first they're comparing an 8-core Bulldozer to a Sandy Bridge with 4 cores and no hyperthreading, and it's basically even; sometimes Bulldozer wins by a small margin on the threaded ones. In the second the 3770K has 4 cores with hyperthreading, and that makes Bulldozer look even worse.

If they were actually getting twice the integer performance per module that Intel was getting per core then it might've been interesting, but being the same or only slightly better when comparing modules to cores wasn't enough to overcome the single-thread performance deficit, which people still care about a lot.