


	by alden5 1196 days ago
	I haven't noticed 4-bit quantization affecting the quality of LLaMA-7B; it produces very coherent outputs. The trick is having a good example in your prompt so the model has a good idea of what's expected of it.

1 comment

muttled 1195 days ago

Quality and quantity: I've had the best luck cramming a bunch of examples into the input, just like with GPT-J, where you're only working with 6B parameters. It helps to keep the format consistent and, ideally, to present the examples in the shape you'd encounter that same text in if you found it on a webpage somewhere.
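
For anyone who wants to try this, here is a rough sketch of the idea in Python. The task (review sentiment), the example texts, the `Review:`/`Sentiment:` labels, and the `build_prompt` helper are all made up for illustration; the only point is that every example uses the same template and the last slot is left open for the model to fill in.

```python
# Hypothetical few-shot prompt builder. The task, example texts, and
# field labels are invented for illustration only.
EXAMPLES = [
    ("The battery died after two days.", "negative"),
    ("Setup took thirty seconds and it just works.", "positive"),
    ("Arrived on time, does what it says.", "positive"),
]


def build_prompt(examples, query, header="Customer reviews and their sentiment"):
    """Render every example with one template, then leave the final slot open."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    # The query uses the exact same layout, minus the answer.
    blocks.append(f"Review: {query}\nSentiment:")
    return header + "\n\n" + "\n\n".join(blocks)


if __name__ == "__main__":
    print(build_prompt(EXAMPLES, "The screen cracked in a week."))
```

The resulting string goes to the model as plain text. Because the prompt ends right after `Sentiment:`, the most natural continuation is a label in the same style as the examples above it.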
