


	by hasperdi 126 days ago
	Why distill if you can run the full model yourself, or at another inference provider? Quantization is the better approach in most cases, unless you want to, for instance, create hybrid models, i.e. by distilling from several different sources.