by outlace 1458 days ago
I see it noted that this could speed up machine learning inference, but any hope of this being extended to also speed up training? I imagine with 100x speedup in matmuls, albeit approximate matmuls, one could plausibly train on a CPU.

Yes. It's another research project to make this happen, but I think it would be fairly straightforward. The issue is that you can't backprop through the assignment step, so you get no gradient with respect to the input, and nothing upstream of an approximated layer can be updated. This mandates a progressive layer freezing strategy. I don't think it would be too hard to get working though; you'd likely just need to train for longer, or start with a pretrained model and fine-tune it as you freeze + approximate the layers.
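To make the "no gradient with respect to the input" point concrete, here is a minimal sketch. It is not the actual method discussed in this thread: it assumes a plain nearest-prototype assignment as a stand-in for the encoding step, and the names `assign`, `approx_dot`, `prototypes`, and `table` are made up for illustration. The approximate dot product is a table lookup indexed by a hard assignment, so it is piecewise constant in the input and a finite-difference gradient comes out as zero.

```python
import numpy as np

def assign(x, prototypes):
    # Hard assignment: index of the nearest prototype (hypothetical stand-in
    # for the encoding step; the real method may encode inputs differently).
    dists = ((prototypes - x) ** 2).sum(axis=1)
    return int(np.argmin(dists))

def approx_dot(x, prototypes, table):
    # Approximate x @ w by looking up the precomputed prototype @ w.
    return table[assign(x, prototypes)]

prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])
w = np.array([2.0, -1.0])
table = prototypes @ w          # precomputed once per weight vector
x = np.array([0.2, 0.1])

# Central finite differences of the approximate output w.r.t. each input coordinate.
eps = 1e-6
fd_grad = np.array([
    (approx_dot(x + eps * e, prototypes, table)
     - approx_dot(x - eps * e, prototypes, table)) / (2 * eps)
    for e in np.eye(len(x))
])

print("exact gradient of x @ w w.r.t. x:", w)
print("finite-difference gradient of approx_dot w.r.t. x:", fd_grad)
```

Small changes to `x` don't change which prototype it is assigned to, so the output doesn't move and no learning signal reaches whatever produced `x`. That is why the layers feeding an approximated layer would have to be frozen first.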