by joennlae
946 days ago
Author here:
Let me try to give an overview, as I saw some questions repeating themselves.

* This accelerator targets the edge/inference case, so there is no training on this chip.
* We introduce a differentiable form of Maddness, which allows Maddness to be used in end-to-end training, and present an application to ResNet.
* We are still in the process of understanding how this will translate to transformers.
* The goal was to show that Maddness is feasible with good hardware codesign.
* Compared to other extreme quantisation (BNN/TNN) and pruning schemes, this is more general, as it replaces the matmul with an approximate matmul.
* The model architecture is not fixed in hardware. It is "just" a matmul unit.

I hope this helps :-)
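To make the "approximate matmul" point a bit more concrete, here is a minimal sketch of the general lookup-table idea, not the actual Maddness algorithm or the hardware design. It assumes a product-quantization-style scheme: each input row is split into subvectors, each subvector is replaced by the index of its nearest prototype, and the dot product is then assembled from precomputed table entries using additions only. The nearest-prototype search is a stand-in; Maddness itself uses learned hash functions for the encoding step so that no multiplications are needed there either. All names (`split`, `encode`, `build_lut`, `approx_dot`) and the prototype values are hypothetical and chosen for illustration.

```python
def split(vec, n_sub):
    """Split a vector into n_sub equal-length contiguous subvectors."""
    step = len(vec) // n_sub
    return [vec[i * step:(i + 1) * step] for i in range(n_sub)]


def dot(u, v):
    """Exact dot product, used only offline to fill the lookup table."""
    return sum(a * b for a, b in zip(u, v))


def encode(row, prototypes):
    """For each subspace, return the index of the nearest prototype.

    Stand-in for the encoding step: Maddness uses learned hashing here,
    this sketch uses a plain squared-distance search instead.
    """
    codes = []
    for sub, protos in zip(split(row, len(prototypes)), prototypes):
        dists = [sum((a - p) ** 2 for a, p in zip(sub, proto)) for proto in protos]
        codes.append(dists.index(min(dists)))
    return codes


def build_lut(weight_col, prototypes):
    """Precompute prototype-times-weight dot products for one output column."""
    w_subs = split(weight_col, len(prototypes))
    return [[dot(proto, w_sub) for proto in protos]
            for protos, w_sub in zip(prototypes, w_subs)]


def approx_dot(row, prototypes, lut):
    """Approximate row . weight_col using table lookups and additions only."""
    return sum(lut[s][c] for s, c in enumerate(encode(row, prototypes)))


# Hypothetical toy setup: 2 subspaces, 2 prototypes each, 4-dimensional input.
prototypes = [
    [[0, 0], [1, 1]],  # prototypes for the first two input dimensions
    [[2, 0], [0, 2]],  # prototypes for the last two input dimensions
]
weights = [1, 2, 3, 4]            # one column of the weight matrix
lut = build_lut(weights, prototypes)

row = [0.9, 1.2, 0.1, 1.8]        # close to the prototypes [1, 1] and [0, 2]
approx = approx_dot(row, prototypes, lut)
exact = dot(row, weights)
```

In this toy case the approximation returns the table value for the nearest prototypes, which differs slightly from the exact dot product; the quality of the result depends entirely on how well the prototypes cover the input distribution.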