

	by munro 1491 days ago
	That /sounds/ right, but training still has a forward part, so OP does raise a really great question. And looking at the silicon, the neural engine is almost the size of the GPU. Really need someone educated in this area to chime in :)

2 comments

dgacmu 1491 days ago

You have to stash more information from the forward pass in order to calculate the gradients during backprop. You can't just naively use an inference accelerator as part of training - inference-only gets to discard intermediate activations immediately.
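A minimal sketch of that difference, not tied to any particular accelerator. It uses a toy network of scalar tanh layers; the function names and the choice of "loss = final output" are illustrative assumptions. Inference overwrites each activation as it goes, while the training pass has to keep every layer's input and output around until the backward pass has used them.

```python
import math

def layer(x, w):
    # One toy "layer": a scalar weight followed by tanh.
    return math.tanh(w * x)

def infer(x, weights):
    # Inference: each activation overwrites the previous one,
    # so only a single value is live at any time.
    for w in weights:
        x = layer(x, w)
    return x

def train_step_grads(x, weights):
    # Training forward pass: stash every layer's input and output.
    inputs, outputs = [], []
    for w in weights:
        inputs.append(x)
        x = layer(x, w)
        outputs.append(x)

    # Backward pass for loss = final output, walking layers in reverse.
    grads = [0.0] * len(weights)
    upstream = 1.0
    for i in reversed(range(len(weights))):
        local = 1.0 - outputs[i] ** 2          # d tanh(z)/dz = 1 - tanh(z)^2
        grads[i] = upstream * local * inputs[i]  # gradient w.r.t. weights[i]
        upstream = upstream * local * weights[i] # gradient w.r.t. layer input
    return x, grads, len(inputs)

weights = [0.5, -1.2, 0.8]
out, grads, stashed = train_step_grads(0.3, weights)
print("stashed activations:", stashed, "gradients:", grads)
```

The stash grows with network depth (and, in a real model, with batch size and layer width), which is memory an inference-only pipeline never has to provide.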

(Also, many inference accelerators use lower precision than you would use when training.)
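One reason precision can matter more for training, sketched under assumptions: the comment does not say which formats are involved, so IEEE half precision (fp16) stands in for "lower precision" here, and the weight and update values are made up. A small gradient step that survives in double precision can round away entirely in fp16.

```python
import struct

def to_fp16(x):
    # Round a Python float to IEEE 754 half precision and back.
    return struct.unpack("<e", struct.pack("<e", x))[0]

weight = 1.0
update = 1e-4  # hypothetical small gradient step

# fp16 values near 1.0 are spaced about 0.001 apart, so the update is lost.
fp16_result = to_fp16(to_fp16(weight) + to_fp16(update))
fp64_result = weight + update

print("fp16:", fp16_result, "fp64:", fp64_result)
```

A forward pass usually tolerates this kind of rounding, but an update that repeatedly rounds to zero means the weight never moves.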

There are tricks you can do to use inference to accelerate training, such as one we developed to focus on likely-poorly-performing examples: https://arxiv.org/abs/1910.00762
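A rough sketch of the general idea as described in this comment, not the linked paper's actual method: run a cheap forward-only pass over the batch, then spend the backward pass only on the examples with the highest loss. The top-k selection rule, the one-parameter linear model, and all names below are simplifying assumptions; the paper's selection rule may differ.

```python
def select_hard_examples(losses, keep_fraction=0.5):
    # Return the indices of the highest-loss examples, in original order.
    k = max(1, int(len(losses) * keep_fraction))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:k])

def train_step(w, batch, lr=0.1, keep_fraction=0.5):
    # 1. Forward-only pass (the part inference hardware could handle).
    losses = [(w * x - y) ** 2 for x, y in batch]
    # 2. Keep only the examples the model currently does worst on.
    chosen = select_hard_examples(losses, keep_fraction)
    # 3. Compute the gradient of the squared error on the chosen subset only.
    grad = sum(2 * (w * batch[i][0] - batch[i][1]) * batch[i][0]
               for i in chosen) / len(chosen)
    return w - lr * grad, chosen

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 0.0), (1.0, 2.1)]
new_w, chosen = train_step(2.0, batch)
print("backprop on examples:", chosen, "updated weight:", new_w)
```

The saving comes from step 3 touching fewer examples; the forward pass in step 1 still has to run on the whole batch.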


my123 1491 days ago

The neural engine is only exposed through a CoreML inference API.

You can't even poke the ANE hardware directly from a regular process. The interface for accessing the neural engine is not hardened (you can easily crash the machine from it).

So the matter is essentially moot in practice as you'd need your users to run with SIP off...


viraptor 1490 days ago

That doesn't seem to be a huge issue. If someone actually does this for income, would they avoid disabling SIP for, say, a 2x performance gain?


munro 1489 days ago

Sounds like you've done a bit of digging around; your efforts are appreciated. I found a GitHub of people sharing what they know, and here's a guy live streaming hacking it and building a tinygrad: https://youtu.be/mwmke957ki4
