| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by aab0 3725 days ago
	Using sin(x) or the other input features like x^2 goes back to making it too easy, though. So far the best I can do is 7 layers of 7 which gets a loss of 0.02. 3x7 is almost cracking the Swiss Roll but can't quite finish it off and gets stuck at 0.05: https://imgur.com/Z3f2ECc ... Surprisingly, 2x8 can do it, as long as I have noise or regularization on, but 8/7 then seriously struggles. Is 16 neurons a critical limit here?

2 comments

teraflop 3725 days ago

I managed to get to 0.01 loss from only x1/x2, using 3 hidden layers, L1 regularization, a bit of added noise, and some patience: http://i.imgur.com/Y3zKpJF.png

link

aab0 3725 days ago

Yes, noise & regularization seem to be key here. I've gotten a 2-layer with 7/8 neurons down to 0.06 and dropping but only with noise & l1: http://playground.tensorflow.org/#activation=relu&regulariza... Final loss of 0.051. Interestingly, increasing noise from 10 to 15 destroys performance, loss of 0.47.

link

vanboxel 3725 days ago

Is it really "making it too easy" if you're applying your knowledge of the structure of the problem space to make it easier for the computer to solve? Certainly this isn't easy to do with every problem, but it seems like a better idea in general to start with parameters you suspect to be correct.

In the "swiss cake roll" the circular nature of the classes suggests using a sin or cos function, and the fact that they spiral out suggests also inputting magnitude information. Sure, you can just add more neurons that will end up computing the same thing, but we might as well give the computer a head start when we can.

link

Lerc 3725 days ago

I look at things like this as not "making it too easy", but rather, "time for a more difficult problem".

I'd quite like if you could define your own input patterns and data sets.

link