	by eightysixfour 1246 days ago
	https://github.com/facebookresearch/llama/blob/main/FAQ.md#3 Looks like it needs 14 GB for the weights. It isn't clear what the minimum size for the decoding cache is, but the default settings are tuned for 30 GB GPUs.
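	For anyone who wants to redo the arithmetic, here is a rough back-of-the-envelope sketch. It assumes a nominal 7 billion parameters stored as fp16 (2 bytes each), and a key/value cache of 2 tensors per layer, each of shape batch x sequence x heads x head dimension. The 7B shape (32 layers, 32 heads, head dimension 128) and the default batch size of 32 and sequence length of 1024 are my assumptions, not something stated in this thread, and the function names are just illustrative.

```python
GB = 10**9  # decimal gigabytes, matching the "14 GB" figure


def weight_bytes(n_params: int, bytes_per_param: int = 2) -> int:
    """Memory for the weights alone (fp16 = 2 bytes per parameter)."""
    return n_params * bytes_per_param


def kv_cache_bytes(n_layers: int, n_heads: int, head_dim: int,
                   max_batch_size: int, max_seq_len: int,
                   bytes_per_value: int = 2) -> int:
    """Memory for cached keys and values: 2 tensors (K and V) per layer."""
    return (2 * n_layers * max_batch_size * max_seq_len
            * n_heads * head_dim * bytes_per_value)


# Assumed 7B configuration and assumed default settings (see note above).
weights = weight_bytes(7_000_000_000)
cache_default = kv_cache_bytes(n_layers=32, n_heads=32, head_dim=128,
                               max_batch_size=32, max_seq_len=1024)
cache_small = kv_cache_bytes(n_layers=32, n_heads=32, head_dim=128,
                             max_batch_size=1, max_seq_len=512)

print(f"weights:                  {weights / GB:.1f} GB")
print(f"cache (batch 32, 1024):   {cache_default / GB:.1f} GB")
print(f"cache (batch 1, 512):     {cache_small / GB:.2f} GB")
```

	Under those assumptions the weights plus the default cache land just over 30 GB, and the cache shrinks linearly with batch size and sequence length, so the minimum is mostly a question of how small you set those two.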

1 comment

In int8, 7B needs only 9 GB of VRAM and 13B needs only 20 GB on a single GPU. https://github.com/oobabooga/text-generation-webui/issues/14...
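
As a quick sanity check on those numbers, here is a small sketch comparing them with the weights-only footprint at 1 byte per parameter. The parameter counts are the nominal 7 and 13 billion (an assumption; the real counts differ slightly), and the variable names are just illustrative. Whatever is left over would be cache, activations, and framework overhead.

```python
GB = 10**9  # decimal gigabytes

# VRAM figures as reported in the comment above.
reported_vram_gb = {"7B": 9, "13B": 20}

# Nominal parameter counts (assumption: model names taken at face value).
nominal_params = {"7B": 7_000_000_000, "13B": 13_000_000_000}


def int8_weight_gb(n_params: int) -> float:
    """Weights-only footprint at 1 byte per parameter, in GB."""
    return n_params * 1 / GB


# Reported VRAM minus int8 weights: the part not explained by weights alone.
overhead_gb = {
    name: reported_vram_gb[name] - int8_weight_gb(params)
    for name, params in nominal_params.items()
}

for name, extra in overhead_gb.items():
    print(f"{name}: {int8_weight_gb(nominal_params[name]):.0f} GB weights, "
          f"{extra:.0f} GB other")
```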