Hacker News
by bubblyworld 720 days ago
In my experience, changing the internals of a net is more likely to affect its training speed and ability to converge than its observable behaviour after training. In general, mutations make things worse unless you have good reason to believe they'll make things better (e.g. residual layers in ResNets).

(Disclaimer: I'm not an ML expert, so maybe this is just a me problem, but I find nets extremely sensitive to things like changing activations, adding normalisation, layer initialisation, and layer sizes, all of which seem kinda arbitrary to a non-expert like me.)
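
To make the initialisation point a bit more concrete, here is a rough sketch (not anything from a real training setup) of how a seemingly arbitrary constant can matter. It pushes one random vector through a stack of bias-free ReLU layers with random weights and compares two weight variances. The function name `output_to_input_rms`, the `gain` parameter, and the depth/width/seed values are all made up for illustration, and it only looks at the forward pass at initialisation, not at training.

```python
import math
import random


def output_to_input_rms(gain, depth=20, width=64, seed=0):
    """Push one random vector through `depth` bias-free ReLU layers.

    Weights are drawn from N(0, gain / width).
    Returns rms(output) / rms(input).
    """
    # Same seed -> the same underlying random draws for every gain,
    # so only the weight scale differs between runs.
    rng = random.Random(seed)
    std = math.sqrt(gain / width)

    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    input_rms = math.sqrt(sum(v * v for v in x) / width)

    for _ in range(depth):
        # One dense layer followed by ReLU; each output unit draws
        # its own row of `width` weights.
        x = [
            max(0.0, sum(rng.gauss(0.0, 1.0) * std * v for v in x))
            for _ in range(width)
        ]

    return math.sqrt(sum(v * v for v in x) / width) / input_rms


if __name__ == "__main__":
    for name, gain in [("gain 2 (He-style)", 2.0), ("gain 1", 1.0)]:
        ratio = output_to_input_rms(gain)
        print(f"{name}: output/input RMS = {ratio:.3g}")
```

Because ReLU just scales with its input and both runs share the same random draws, the two results should differ by a factor of about sqrt(2)^20 = 1024: a factor of sqrt(2) per layer in the weight scale compounds into three orders of magnitude over 20 layers. That kind of compounding is one plausible reason small internal changes can hurt convergence so much.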