| HN Mirror



	by bernaferrari 982 days ago
	Even if that's true, the A-series Neural Engine is much worse than the M-series one. I'll probably say it wrong, but it can only do 32-bit inference (or something like that) where the M series can do 64-bit, so the A series can run LLMs but has a series of limitations that the M series doesn't.


icyfox 982 days ago

Practically speaking, most models today run inference at 8-bit or 16-bit precision (occasionally 32-bit, but rarely). You don't see an empirical lift from more bits of precision. The size of the memory is far more important.
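
To make the memory point concrete, here is a back-of-the-envelope sketch of how much space the weights alone take at each precision. The 7-billion-parameter count is an illustrative assumption, not a figure from this thread, and `weight_memory_bytes` is a made-up helper name. It ignores activations, caches, and runtime overhead.

```python
def weight_memory_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store the weights alone (no activations or overhead)."""
    return num_params * bits_per_weight // 8


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical model size, chosen for illustration
    for bits in (8, 16, 32):
        gib = weight_memory_bytes(num_params, bits) / 2**30
        print(f"{bits:>2}-bit weights: {gib:.1f} GiB")
```

Halving the bits per weight halves the footprint, which is often what decides whether a model fits in a device's memory at all.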


OJFord 982 days ago

If we're talking about the results, is there any reason to think it should make a difference at all?


icyfox 982 days ago

Sometimes gradients are small but meaningful; if you constrain them to too few bits / degrees of freedom, they'll be unable to backprop successfully. This can hamper training and therefore the quality of the results.
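
A minimal sketch of the "small but meaningful" case, using only Python's standard library to round a value to half and single precision. The gradient magnitude `1e-8` is a hypothetical value picked because it sits below the smallest positive float16 number (about 6e-8), and the helper names are made up for this example.

```python
import struct


def to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


if __name__ == "__main__":
    grad = 1e-8  # hypothetical gradient, smaller than float16 can represent
    print("as float16:", to_float16(grad))  # underflows, so the update is lost
    print("as float32:", to_float32(grad))  # survives with a tiny rounding error
```

In 16 bits the gradient is stored as zero, so that weight never gets updated; in 32 bits it is kept.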

You can also think about it as compounding errors - at any one weight index the bit values might not be too meaningful, but cascaded over a lot of tensor multiplications they will be.
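
The compounding effect can be sketched with a scalar toy, assuming a chain of 1000 multiplications by 1.0003 as a stand-in for a deep stack of layers. Both numbers are hypothetical and deliberately chosen so that each single rounding error looks negligible; real tensor kernels behave in more complicated ways and may accumulate in higher precision.

```python
import struct


def to_float16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def chain(scale: float, steps: int, round_fn=lambda v: v) -> float:
    """Multiply 1.0 by `scale` `steps` times, applying round_fn after each step."""
    x = 1.0
    for _ in range(steps):
        x = round_fn(x * scale)
    return x


if __name__ == "__main__":
    scale, steps = 1.0003, 1000  # hypothetical per-step gain and chain depth
    exact = chain(scale, steps)              # Python floats (64-bit), no extra rounding
    half = chain(scale, steps, to_float16)   # rounded to float16 after every step
    print("64-bit result: ", exact)
    print("float16 result:", half)
    print("relative error:", abs(exact - half) / exact)
```

Each float16 step is off by only about 0.03%, but because 1.0003 rounds back to 1.0 every time, the float16 chain never moves while the 64-bit chain grows by roughly a third.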


OJFord 982 days ago

Oh, I was thinking we were talking about the same calculations on different hardware.
