| HN Mirror



	by mauricio 792 days ago
	22B params * 2 bytes (FP16) = 44GB just for the weights. That doesn't include the KV cache or other runtime overhead such as activations. Once the model is quantized to, say, 4-bit ints, the weights shrink to 22B params * 0.5 bytes = 11GB.
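
The same back-of-the-envelope arithmetic as a small Python sketch. The function name `weight_memory_gb` is made up for illustration, and it assumes 1 GB = 10^9 bytes and counts only the weights, so the KV cache, activations, and any per-block quantization metadata are left out:

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Rough memory needed for the weights alone, in GB (1 GB = 1e9 bytes).

    Hypothetical helper: ignores KV cache, activations, and any extra
    metadata (scales, zero points) a real quantization format may store.
    """
    bytes_per_param = bits_per_param / 8
    return num_params * bytes_per_param / 1e9


params = 22e9  # 22B parameters, as in the comment above

fp16_gb = weight_memory_gb(params, 16)  # 2 bytes per parameter
int4_gb = weight_memory_gb(params, 4)   # 0.5 bytes per parameter

print(f"FP16 weights: {fp16_gb:.0f} GB")  # FP16 weights: 44 GB
print(f"INT4 weights: {int4_gb:.0f} GB")  # INT4 weights: 11 GB
```

Actual usage will be higher than these numbers, since they are a floor set by the weights only.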