| HN Mirror



	by simjnd 57 days ago
	For TurboQuant on model weights, AFAIK it's currently a single-person effort [1]. It needs his fork of llama.cpp, which hasn't been upstreamed. He publishes his quantizations on Hugging Face, but I'm not sure whether he has open-sourced the quantization pipeline.
	[1]: https://x.com/coffeecup2020