by rStar 1687 days ago
I’m hoping Adobe has yet to fully optimize for the new architecture, and I’m also hoping they care to do so.

It could also be that x86 has better SIMD support by a considerable margin, which can make repetitive memory-access and serialized workloads a lot faster. I'm not super well acquainted with those extensions, but I know that the NEON SIMD implementation in ARMv8 leaves quite a bit to be desired. It's a tricky situation, and one I don't see resolving in a nice clean way. It's stuff like this that makes me hopeful for RISC-V though, where we could theoretically have our cake and eat it too, with dynamic instruction pipelines and incredibly low power usage. Only time will tell, I suppose.
Throughput is not an issue on the M1, with 4x 128-bit SIMD units.
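
As a rough sanity check on that figure, here is a back-of-the-envelope sketch of peak per-core throughput. The helper name `peak_gflops` is hypothetical, and it assumes (not stated above) that each SIMD unit can issue one fused multiply-add per cycle and that the caller supplies the clock speed.

```c
#include <stdio.h>

/* Rough peak GFLOP/s for a single core.
   Every input is an assumption supplied by the caller; this is an
   upper bound on paper, not a measured number. */
static double peak_gflops(int simd_units, int vector_bits, int element_bits,
                          double clock_ghz)
{
    int lanes = vector_bits / element_bits; /* elements per vector register */
    int flops_per_lane = 2;                 /* one fused multiply-add = 2 flops */
    return (double)simd_units * lanes * flops_per_lane * clock_ghz;
}
```

With 4 units, 128-bit vectors, 32-bit floats and an assumed 3.2 GHz clock, `peak_gflops(4, 128, 32, 3.2)` works out to 4 x 4 x 2 x 3.2 = 102.4 GFLOP/s per core in single precision, and half that in double precision.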

NEON is certainly not a bad SIMD ISA; it's quite an orthogonal one.

You also have the AMX extension at hand, which is more special-purpose but delivers very high throughput (on a regular M1: 350 GFLOPS DGEMM, 1.2 TFLOPS SGEMM, without leveraging anything other than the CPU).

NEON can work pretty well and has been used in real-time tasks on mobile for a long time. Example operations:

https://github.com/thenifty/neon-guide/
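
To give a flavor of what such operations look like, here is a minimal sketch of a SAXPY loop (`y = a*x + y`) using NEON intrinsics from `arm_neon.h`. It is not taken from the linked guide; the function name `saxpy` is just illustrative, and the scalar fallback is there so the same file also builds on non-ARM targets.

```c
#include <stddef.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* y[i] = a * x[i] + y[i] for i in [0, n). */
static void saxpy(float a, const float *x, float *y, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    float32x4_t va = vdupq_n_f32(a);        /* broadcast a into all 4 lanes */
    for (; i + 4 <= n; i += 4) {
        float32x4_t vx = vld1q_f32(x + i);  /* load 4 floats from x */
        float32x4_t vy = vld1q_f32(y + i);  /* load 4 floats from y */
        vy = vmlaq_f32(vy, va, vx);         /* vy + va * vx, lane by lane */
        vst1q_f32(y + i, vy);               /* store 4 results back to y */
    }
#endif
    /* Scalar tail, and the whole loop when NEON is unavailable. */
    for (; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Each 128-bit register holds four floats, so the vector loop handles four elements per iteration and the scalar loop picks up whatever is left when `n` is not a multiple of four.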