HN Mirror

	by helloericsf 730 days ago
	Seems interesting! https://github.com/turboderp/exllama "A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights."