


	by dmichulke 82 days ago
	Forgive my ignorance, but aren't they already on Hugging Face? I assumed TurboQuant optimizations were already everywhere: in llama.cpp, or in the quantization machinery of Unsloth and the like.

1 comment

rapatel0 76 days ago

I forked it to also add rotorquant. This is a specific optimization that uses Clifford rotors instead of a static, compile-time random permutation to store the activations. It reduces both the space and the parameter count needed for storage.
