| HN Mirror

	by hidenotslide 3116 days ago
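	I saw a talk on this paper a couple of years ago: https://arxiv.org/abs/1503.02531 ("Distilling the Knowledge in a Neural Network"). The method is to train a smaller model on the predictions (the softened output probabilities) of a large model or ensemble, rather than only on the hard labels. I'd be interested in knowing other techniques as well.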
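For anyone who wants to see the idea concretely, here is a rough sketch of the soft-target loss as I understand it from the paper. The function names, the example logits, and the temperature value are made up for illustration, and this only covers the loss term, not a full training loop.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's softened predictions against the teacher's soft targets."""
    soft_targets = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    ce = -sum(p * math.log(q) for p, q in zip(soft_targets, student_probs))
    # Scale by T^2 so the gradient magnitude stays comparable across temperatures.
    return ce * temperature ** 2

# Hypothetical logits for a 3-class problem (assumed values, not from the paper).
teacher_logits = [5.0, 2.0, -1.0]
close_student = [4.0, 1.5, -0.5]   # roughly agrees with the teacher
far_student = [-1.0, 2.0, 5.0]     # ranks the classes in the opposite order

print(distillation_loss(close_student, teacher_logits))
print(distillation_loss(far_student, teacher_logits))
```

The student that agrees with the teacher gets the lower loss. In practice this term is usually combined with an ordinary cross-entropy loss on the true labels, and the temperature is a hyperparameter you would have to tune.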