Hacker News new | ask | show | jobs
by telmop 206 days ago
Cool method. Pre deep learning there was plenty of interesting research on sparse methods. What do you think we're missing to have more widely used neural+sparse approaches?
1 comments

I think the lack of efficient GPU kernels was the main problem. It is much, much easier to get a real speedup and memory reduction from quantization from fp16 to fp8 than from 50% sparsity. For sparsity you needed structure (which makes your model worse) and special hardware support.