HN Mirror


	by redmalang 3 days ago
	Try llama.cpp; it seems to be a lot more performant and a lot more hackable. Also, I'm surprised by how substantial an impact some of the inference configs (beyond just temperature) can have, though this is much more model-specific.
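
To make the point about inference configs concrete, here is a minimal sketch of how temperature, top-k, and top-p can each reshape the set of tokens a sampler draws from. It is an illustration only: the function name `filter_probs`, its parameters, and the order in which the filters are applied are assumptions made for this example, not llama.cpp's actual sampler code or its defaults.

```python
import math

def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p.

    Hypothetical helper for illustration; not taken from llama.cpp.
    """
    # Temperature rescales logits before softmax; values below 1 sharpen
    # the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        order = order[:top_k]

    # top-p: keep the shortest prefix whose cumulative probability
    # reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four tokens
    print(filter_probs(logits))
    print(filter_probs(logits, top_k=2))
    print(filter_probs(logits, top_p=0.5))
    print(filter_probs(logits, temperature=0.5))
```

Even on four made-up logits, each setting changes which tokens remain and how much probability the top token carries, which is one plausible reason the effect differs so much from model to model.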