by a-dub 2006 days ago
plus doesn't the M1 have specialized hardware? what's the neuro-engine or whatever it is that they call it for speeding up ML? i imagine at its core it's a bunch of instructions for doing vector operations.

side bar: is there documentation for the instruction set or abi for that hardware?

3 comments

> side bar: is there documentation for the instruction set or abi for that hardware?

No. Apple stans, food for thought.

I would hope there will be something soon, although it's Apple, so probably not much.

Edit: Still basically no, but https://github.com/geohot/tinygrad/tree/master/ane has got the instruction format apparently.

You can attack an instruction set blind (https://recon.cx/2012/schedule/events/236.en.html).

More edit: Bingo, patent: https://patents.google.com/patent/US20190340491A1/en?oq=2019...

The Apple Neural Engine is separate from the CPU; it's not additional registers and instructions for the CPU, like a vector unit. You go through the Core ML framework to use it, just like you go through Metal or OpenGL to use the GPU.
The value of SIMD on a CPU these days is really as a middle ground for when you value latency above throughput, so you would probably face the same trade-off as when getting data to and from a GPU.
i thought the M1 was a SoC with a unified memory architecture?
That definitely changes the calculus, but as I've mentioned in a different comment, there doesn't seem to be any microarchitectural documentation to read at all, so I (I don't own an M1) have nothing to go on, unfortunately.

I'll make a wild guess that getting data to the neural engine is still probably not quick because I assume it's some kind of statically scheduled type affair (exposed pipeline?). We literally seem to know almost nothing about it sadly.
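
To make the trade-off in this sub-thread concrete, here is a toy cost model in Python. It is only a sketch: the `Device` class, the `break_even` helper and every number in it are hypothetical, and nothing here is measured on an M1 or the Neural Engine. It assumes run time is linear in the number of elements, with a fixed dispatch cost plus an optional per-element copy cost for the accelerator.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Toy linear cost model for one compute device (all figures hypothetical)."""
    name: str
    elems_per_us: float             # sustained throughput, elements per microsecond
    dispatch_us: float = 0.0        # fixed cost to launch a job (driver, scheduling)
    copy_us_per_elem: float = 0.0   # cost to move one element to and from the device

    def slope(self) -> float:
        """Marginal cost in microseconds of one extra element."""
        return self.copy_us_per_elem + 1.0 / self.elems_per_us

    def time_us(self, n: int) -> float:
        """Estimated time to process n elements, overhead included."""
        return self.dispatch_us + n * self.slope()


def break_even(cpu: Device, accel: Device) -> float:
    """Problem size at which accel stops being slower than cpu (inf if never)."""
    fixed_gap = accel.dispatch_us - cpu.dispatch_us
    if fixed_gap <= 0:
        return 0.0           # no extra launch cost, so nothing to amortize
    if accel.slope() >= cpu.slope():
        return math.inf      # accelerator never catches up
    return fixed_gap / (cpu.slope() - accel.slope())


# Made-up devices: CPU SIMD has no launch cost but lower throughput.
CPU_SIMD = Device("cpu-simd", elems_per_us=1_000.0)
# Accelerator behind an explicit copy (discrete-GPU style).
ACCEL_COPY = Device("accel-copy", elems_per_us=10_000.0,
                    dispatch_us=50.0, copy_us_per_elem=0.0005)
# Same accelerator with unified memory: the copy cost drops out, dispatch stays.
ACCEL_UNIFIED = Device("accel-unified", elems_per_us=10_000.0, dispatch_us=50.0)

if __name__ == "__main__":
    for accel in (ACCEL_COPY, ACCEL_UNIFIED):
        n = break_even(CPU_SIMD, accel)
        print(f"{accel.name}: faster than {CPU_SIMD.name} above ~{n:,.0f} elements")
```

Under these invented numbers, unified memory lowers the break-even size but does not remove it, because the fixed dispatch cost is still there. That is the shape of the guess above: small, latency-sensitive jobs stay on CPU SIMD either way.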

https://jobs.apple.com/en-gb/details/200205070/neural-engine...

Even the job listing gives away next to nothing, other than some poor English: "Knowledge in compiler is a plus" ;)