


	by halflings 424 days ago
	That's what the chart says, yes: 14.1GB of VRAM usage for the 27B model.

1 comments

erichocean 424 days ago

That's the VRAM required just to load the model weights.

To actually use a model, you need a context window. Realistically, you'll want a 20GB GPU or larger, depending on how many tokens you need.


oezi 424 days ago

I didn't realize that the context would require so much memory. Is this the KV cache? It would seem like a big advantage if this memory requirement could be reduced.
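
For a rough sense of scale: in a standard transformer, most of the per-token context memory is the KV cache, which stores one key tensor and one value tensor per layer for every token. Below is a minimal back-of-the-envelope sketch in Python. The function name `kv_cache_bytes` and all of the model dimensions are hypothetical values chosen for illustration; they are not the real configuration of the 27B model discussed above, and the estimate ignores activations and framework overhead.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_elem=2):
    """Estimate KV cache size in bytes for a single sequence.

    Assumes a plain transformer that caches keys and values for every
    layer and every token, with no cache quantization or sliding window.
    """
    # Factor of 2: one tensor for keys and one for values in each layer.
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem


# Hypothetical dimensions (NOT taken from any specific model):
# 48 layers, 8 KV heads of size 128, 16-bit cache entries, 32k tokens.
cache = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=128, n_tokens=32768)
print(f"{cache / 2**30:.1f} GiB")  # prints "6.0 GiB"
```

Under these assumed numbers the cache grows linearly with context length, which is why fewer KV heads, a smaller cache data type, or a shorter window all reduce the requirement.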
