by oneplane 1571 days ago
I'm not sure whether he was talking about the ML engine, the Arm cores, the microcode, the library, or the OS. But it does indeed have FP16 in the Arm cores.
2 comments

FP16 is supported on the M1 GPU and Neural Engine through the Core ML framework. From https://coremltools.readme.io/docs/typed-execution :

> The Core ML runtime dynamically partitions the network graph into sections for the Apple Neural Engine (ANE), GPU, and CPU, and each unit executes its section of the network using its native type to maximize its performance and the model’s overall performance. The GPU and ANE use float 16 precision, and the CPU uses float 32.
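To make the float 16 vs. float 32 difference in that quote concrete, here is a rough sketch in plain Python. It does not touch Core ML at all; it only uses the standard `struct` module to round a number to IEEE 754 half and single precision. The helper names `round_to_float16` and `round_to_float32` are made up for illustration.

```python
import struct

def round_to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def round_to_float32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = 0.1
as_fp16 = round_to_float16(value)
as_fp32 = round_to_float32(value)

# Absolute rounding error introduced by each storage format.
fp16_error = abs(as_fp16 - value)
fp32_error = abs(as_fp32 - value)

print(f"float16: {as_fp16!r} (error {fp16_error:.3e})")
print(f"float32: {as_fp32!r} (error {fp32_error:.3e})")

# Half precision has an 11-bit significand, so not every integer
# above 2048 is representable; single precision still holds 2049 exactly.
print(round_to_float16(2049.0), round_to_float32(2049.0))
```

The point is only that the FP16 units trade precision for throughput, which is why the runtime picking the type per compute unit can change a model's numerical results slightly.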

Also, this exploration (https://tlkh.dev/benchmarking-the-apple-m1-max#heading-neura...) reports FP16 performance in the 5.1-5.3 TFLOPS ballpark.

I should have been clearer. I didn't mean the hardware, but the speedup you get from using mixed precision in something like TensorFlow with an NVIDIA GPU.