by joennlae 946 days ago
Author here: Let me try to give an overview, as I saw some questions repeating themselves.

* This accelerator is for an Edge/Inference case, so there is no training on this chip.

* We introduce a differentiable form of Maddness, which allows Maddness to be used in end-to-end training, and present ResNet as an application.

* We are still in the process of understanding how this will translate to transformers.

* The goal was to show that Maddness is feasible with a good codesign of the hardware.

* Compared to other extreme quantisation (BNN/TNN) and pruning schemes, this is more general as it replaces the matmul with an approximate matmul.

* The model architecture is not fixed in hardware. It is "just" a matmul unit.
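
To make "replaces the matmul with an approximate matmul" a bit more concrete, here is a minimal NumPy sketch of a lookup-table matmul in the product-quantization style. It is an illustration under stated assumptions, not the accelerator's or the paper's implementation: the function names (fit_prototypes, encode, build_luts, approx_matmul) are hypothetical, and the encoder uses a plain nearest-prototype search, whereas Maddness, as far as I understand it, uses learned hash functions (small decision trees) for that step.

```python
import numpy as np

def fit_prototypes(A_train, n_codebooks, n_protos, n_iters=10, seed=0):
    """Learn prototypes per column block with a few k-means steps (assumption:
    plain k-means stands in for the prototype learning used by Maddness)."""
    rng = np.random.default_rng(seed)
    n, d = A_train.shape
    sub = d // n_codebooks  # assumes d is divisible by n_codebooks
    protos = np.empty((n_codebooks, n_protos, sub))
    for c in range(n_codebooks):
        X = A_train[:, c * sub:(c + 1) * sub]
        P = X[rng.choice(n, n_protos, replace=False)].copy()
        for _ in range(n_iters):
            idx = np.argmin(((X[:, None, :] - P[None]) ** 2).sum(-1), axis=1)
            for k in range(n_protos):
                if np.any(idx == k):
                    P[k] = X[idx == k].mean(axis=0)
        protos[c] = P
    return protos

def encode(A, protos):
    """Map each sub-vector of each row of A to the index of its nearest prototype.
    Assumption: nearest-prototype search replaces the hash-based encoder."""
    n_codebooks, _, sub = protos.shape
    codes = np.empty((A.shape[0], n_codebooks), dtype=np.int64)
    for c in range(n_codebooks):
        X = A[:, c * sub:(c + 1) * sub]
        dist = ((X[:, None, :] - protos[c][None]) ** 2).sum(-1)
        codes[:, c] = np.argmin(dist, axis=1)
    return codes

def build_luts(protos, B):
    """Precompute luts[c, k, :] = protos[c, k] @ B[rows belonging to block c]."""
    n_codebooks, _, sub = protos.shape
    B_sub = B.reshape(n_codebooks, sub, B.shape[1])
    return np.einsum("cks,csm->ckm", protos, B_sub)

def approx_matmul(codes, luts):
    """Approximate A @ B using only table lookups and additions."""
    out = np.zeros((codes.shape[0], luts.shape[2]))
    for c in range(luts.shape[0]):
        out += luts[c, codes[:, c]]
    return out
```

In an inference setting B would be the fixed weight matrix, so build_luts runs once offline and only encode plus the lookups and additions in approx_matmul run per input. The model architecture does not enter anywhere, which is the sense in which such a unit is a generic matmul replacement.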

I hope this helps :-)