


	by regularfry 1138 days ago
	Yes and no. Some of the optimisation techniques that are being researched at the moment use the output of larger models to fine-tune smaller ones, and that sort of improvement can obviously only be one-way. Same with quantising a model beyond the point where the network is trainable. But anything that helps smaller models run faster without appealing to properties of a bigger model that has to already exist? Absolutely yes.
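To make the "one-way" point concrete, here is a minimal sketch of the kind of loss used when a larger model's outputs fine-tune a smaller one (usually called knowledge distillation). The function names, the temperature value, and the example logits are all made up for illustration and are not taken from any particular paper or library. The thing to notice is that the teacher's logits are an input to the loss, so the bigger model has to exist before the smaller one can benefit.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over temperature-scaled logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) between the softened output distributions.
    # The teacher's logits are required here: without an existing larger
    # model there is nothing to compute this loss against.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # disagrees with the teacher

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
print(loss_close < loss_far)
```

In a real setup this loss would be minimised by gradient descent over the student's weights, often mixed with the ordinary training loss; the sketch only shows the direction of the dependency.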