


	by walrus 2274 days ago
	I'm just speculating (and haven't read the paper yet), but it may be possible to achieve similar speedups on GPUs by pruning the 20% of blocks (of size ≥K×K) with the smallest magnitude, which produces block-sparse weights [0], rather than pruning the smallest 20% of individual weights.

	[0] https://openai.com/blog/block-sparse-gpu-kernels/
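To make the block-pruning idea concrete, here is a rough sketch in plain Python. It is only an illustration under some assumptions that are not in the comment: blocks are a fixed K×K size on a regular grid, a block's magnitude is measured by its L1 norm (the sum of absolute values), and the function name `block_prune` is made up for this example. It has nothing to do with the paper's method or the API of the kernels linked in [0].

```python
def block_prune(weights, k, fraction=0.2):
    """Zero the `fraction` of k-by-k blocks with the smallest L1 norm.

    Hypothetical helper: `weights` is a list of equal-length rows whose
    dimensions are multiples of `k`. Returns a new matrix; the input is
    left unchanged.
    """
    rows, cols = len(weights), len(weights[0])
    if rows % k or cols % k:
        raise ValueError("matrix dimensions must be multiples of k")

    # Score each block by the sum of absolute values of its entries
    # (an assumed magnitude measure; other norms would also work).
    scores = []
    for bi in range(rows // k):
        for bj in range(cols // k):
            norm = sum(
                abs(weights[i][j])
                for i in range(bi * k, (bi + 1) * k)
                for j in range(bj * k, (bj + 1) * k)
            )
            scores.append((norm, bi, bj))

    # Select the lowest-scoring blocks; ties are broken by block position.
    scores.sort()
    n_prune = int(len(scores) * fraction)
    pruned = {(bi, bj) for _, bi, bj in scores[:n_prune]}

    # Build a copy of the matrix with zeros inside every pruned block.
    return [
        [0.0 if (i // k, j // k) in pruned else weights[i][j] for j in range(cols)]
        for i in range(rows)
    ]


if __name__ == "__main__":
    w = [
        [1, 1, 5, 5],
        [1, 1, 5, 5],
        [9, 9, 0.1, 9],
        [9, 9, 9, 9],
    ]
    for row in block_prune(w, k=2, fraction=0.25):
        print(row)
```

In the sample matrix the single smallest weight (0.1) survives because its block is large overall, while the whole top-left block is zeroed. That is the difference from per-weight pruning: the zeros land in contiguous tiles that a block-sparse kernel can skip entirely.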