HN Mirror



	by vishvananda 2187 days ago
	We did a lot of our early experimentation with small networks. I don't think we went any smaller than 5 layers of 64 filters, as we mentioned here: https://medium.com/oracledevs/lessons-from-alpha-zero-part-5...


jonath_laurent 2187 days ago

And what were the results of these experiments? For example, what error rate can you reach with the smallest network architecture you tried?


vishvananda 2187 days ago

Unfortunately, I don't remember the exact numbers, but I think it was a couple of percentage points worse than what we were able to get with the large models.


jonath_laurent 2187 days ago

This is interesting, thanks! Is there anything else you can tell me about the results of your experiments with small networks? I am really interested in this.

For example: did you notice that increasing or decreasing network size required significant changes in other hyperparameters? Do small networks learn faster at the beginning of training before they start to plateau?
