Hacker News
by Pannoniae 108 days ago
The future of x86 is worrying, but it's nowhere near dead yet. I saw the C&C article yesterday and did some research, TL;DR:

- Apple took over the single-threaded crown a while ago.

- ARM also caught up in integer workloads.

- ARM Cortex is still behind in floating-point.

- Both are behind in multithreaded performance. (mostly because there are more high-end x86 systems...)

- Both are way behind in SIMD/HPC workloads. (ARM is generally stuck on 128-bit vectors, x86 is 256-bit on Intel and 512-bit on AMD. Intel will return to 512-bit in the consumer segment too.)

- ARM generally have way bigger L1 caches, mostly due to the larger pagesize, which is a significant architectural advantage.

- ARM is reaching these feats with ~4.5 GHz clocks compared to the ~5.5 GHz clocks on x86. (very rough approximation)
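To make the L1 cache point in the list above concrete, here is a rough sketch of the arithmetic in Python. It assumes the L1 is virtually indexed and physically tagged (VIPT) and that the design avoids aliasing by keeping every index bit inside the page offset, which caps capacity at page size times associativity. The function name and the 8-way figure are illustrative assumptions, not specs of any particular core.

```python
KIB = 1024


def max_alias_free_l1_bytes(page_size_bytes: int, ways: int) -> int:
    """Upper bound on a VIPT L1 that needs no aliasing workarounds.

    Assumption: all set-index bits must come from the page offset,
    so each way can cover at most one page.
    """
    if page_size_bytes <= 0 or ways <= 0:
        raise ValueError("page size and associativity must be positive")
    return page_size_bytes * ways


# Hypothetical comparison: same associativity, different page sizes.
for page_kib in (4, 16, 64):
    cap = max_alias_free_l1_bytes(page_kib * KIB, ways=8)
    print(f"{page_kib:>2} KiB pages, 8-way: up to {cap // KIB} KiB L1")
```

Under those assumptions, going from 4 KiB to 16 KiB pages quadruples the ceiling at the same associativity. A design can still exceed the ceiling by adding ways or handling aliases in hardware, so treat this as a bound on the easy case only.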

Overall, the outlook for x86 is troubling... it's an open question whether it will go the way of IBM POWER (legacy support with strict compatibility but no new workloads at all) or whether it will keep adapting and evolving.

3 comments

The performance/watt delta for M1 over contemporary x86 was massively larger than that of M5 over Panther Lake. M5 and Panther Lake are roughly comparable.

So by that measure the future of x86 seems to be less troubling today than it was 5 years ago.

ARM CPUs are quite good in "general-purpose" applications, like Internet browsing and other things that do not have great computational requirements, as they mostly copy, move, search or compare things, with only a few more demanding computations.

On the other hand, most ARM-based CPUs, even those of Apple, have quite poor performance for things like arithmetic operations with floating-point numbers or with big integer numbers. Geekbench results do not reflect the performance of such applications at all.

This is a serious problem for those who need computers to solve scientific, technical, or engineering computing problems.

During the half century in which IBM PC compatible computers have been dominant, even though the majority of users never exploited the real computational power of their CPUs, buying a standard computer automatically provided, at a low price, a good CPU for the "power" users who need such CPUs.

Now, with the consumer-oriented ARM-based CPUs that have been primarily designed for smartphones and laptops, and not for workstations and servers, such computers remain good for the majority of the users, but they are no longer good enough for those with more demanding applications.

I hope that Intel/AMD based computers will remain available for a long time, so that it is still possible to buy computers with good performance per dollar, taking into account their throughput for floating-point and big integer computations.

Otherwise, if only the kinds of computers made by Apple and Qualcomm were available, users like me would have to buy workstations and servers with many times lower performance per dollar than is achievable with the desktop CPUs of today.

This kind of evolution already happened in GPUs, where a decade ago one could buy a cheap GPU like those bought by gamers, which nevertheless also had excellent performance for scientific FP64 computing. Such GPUs have since disappeared, and the gaming GPUs of today can no longer be used for such purposes; for those one has to buy a "datacenter" GPU, and those cost an arm and a leg.

https://browser.geekbench.com/v6/cpu/15805010

I see x86 on top (the first valid result is 6841, which is x86), if that is the sole benchmark we're going to look at. You can further break that down into the individual tasks it performs, but I'm not going to :-)

> - ARM generally have way bigger L1 caches, mostly due to the larger pagesize, which is a significant architectural advantage.

Larger pages mean more potential for waste.
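A quick sketch of what that waste looks like, in Python. It assumes each allocation gets its own whole pages and nothing else shares them (plain internal fragmentation), which is a simplification. The helper name and the 5000-byte request are made up for illustration.

```python
def wasted_bytes(alloc_bytes: int, page_size_bytes: int) -> int:
    """Bytes left unused when an allocation is rounded up to whole pages."""
    if alloc_bytes < 0 or page_size_bytes <= 0:
        raise ValueError("invalid allocation or page size")
    pages = -(-alloc_bytes // page_size_bytes)  # ceiling division
    return pages * page_size_bytes - alloc_bytes


# Hypothetical 5000-byte mapping under two page sizes.
for page in (4096, 16384):
    print(f"{page}-byte pages: {wasted_bytes(5000, page)} bytes unused")
```

For that request, 4 KiB pages leave 3192 bytes unused and 16 KiB pages leave 11384. How much this matters in practice depends on how many small mappings a workload creates.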

> https://browser.geekbench.com/v6/cpu/15805010

Not to bash on x86 or anything, but that's an outlier. Very overclocked with a compressor chiller or similar. Also the single-threaded and multi-threaded scores are the same; it's probably not stable at full load across all cores.

I don't think that's really representative of the architecture at scale, unless you're making the case for how overclockable (at great power/heat cost) x86 is.