| HN Mirror



	by nsthorat 3122 days ago
	There is a lot of work being done on model compression (quantization, simple factorization tricks, better conv kernels like depthwise separable convs, etc.). We won’t let that happen!

1 comments

jorgemf 3122 days ago

I am aware of that research, but even with a 20x decrease in size some models are still too big for the web (think about the World Wide Web, not the internet in the US).


nsthorat 3122 days ago

Oftentimes researchers train huge models but don't think about model size (because they don't have to). We've seen ~200MB production models get down to ~4MB without losing much precision. I'm quite confident we'll continue that trend.

Don't forget that folks were saying this about the web when images / rich media were becoming prevalent!


jorgemf 3122 days ago

200MB is still a small model, and 4MB is almost double the size of an average web page (including images). 10MB web pages are really bad, even more so for countries that are still developing their infrastructure.


niyazpk 3122 days ago

>> We've seen ~200MB production models get down to ~4MB and not lose much precision.

Details please. What techniques are used to reduce the model size?


hidenotslide 3122 days ago

I saw a talk on this paper a couple of years ago: https://arxiv.org/abs/1503.02531. The method is to train a smaller model on the predictions of a large model or ensemble. I'd be interested in knowing other techniques as well.
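
For anyone who wants to see roughly what "train a smaller model on the predictions of a large model" looks like, here is a minimal sketch of the loss term only, in plain Python. It assumes the large (teacher) and small (student) models both output raw logits, and that the teacher's outputs are softened with a temperature before being used as targets. The function names, the example logits, and the temperature value are made up for illustration; the weighting against a hard-label loss and the actual training loop are left out.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened predictions and the student's.

    The teacher's probabilities act as "soft targets" in place of one-hot labels.
    """
    soft_targets = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(soft_targets, student_probs))

# Hypothetical logits for a single 3-class example.
teacher_logits = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # disagrees with the teacher

print(distillation_loss(close_student, teacher_logits))
print(distillation_loss(far_student, teacher_logits))
```

The student that agrees with the teacher gets the lower loss, so minimizing this over a dataset pushes the small model toward the large model's output distribution.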
