Hacker News new | ask | show | jobs
by owenthejumper 538 days ago
I am skeptical of generic sparsification efforts. After all, companies like Neural Magic spent years trying to make it work, only to pivot to 'vLLM' engine and be sold to Red Hat
1 comments

Link shows this isn't sparsity as in inference speed, it's spare autoencoders, as in interpreting the features in an LLM (SAE anthropic as a search term will explain more)