| HN Mirror

	by alex7o 50 days ago
	Because when you pay for a subscription they don't silently quantize the model a few weeks after release, leaving you unable to run the full model anymore. Otherwise there's no need for full fp16: int8 works 99% as well for half the memory, and the lower you go the more you start to pay for the quants. But int8 is super safe imo.
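
	To put rough numbers on the "half the mem" point, here is a back-of-the-envelope sketch. It assumes weights dominate memory and ignores the KV cache, activations, and the small overhead of quantization scales; the 7B parameter count is a made-up example, not something from the comment above.

```python
# Approximate bytes per weight for each format (weights only).
# Real quantized checkpoints also store scales/zero-points, so actual
# sizes run slightly higher than these figures.
BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(n_params: int, fmt: str) -> float:
    """Return the approximate weight memory in GiB for a given format."""
    return n_params * BYTES_PER_WEIGHT[fmt] / 2**30


if __name__ == "__main__":
    n_params = 7_000_000_000  # hypothetical model size, for illustration only
    for fmt in BYTES_PER_WEIGHT:
        print(f"{fmt}: {weight_memory_gib(n_params, fmt):.1f} GiB")
```

	This only covers the memory side. The "99% as well" part is a quality claim that depends on the model and the quantization method, so it would need to be measured separately.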