by twofornone 1714 days ago
I suspect it's because there's a lot of entropy in hair, and because of the shape of the optimization function (which might even have a spatial term): in such a noisy, hard-to-learn region the network settles on a regular pattern, which is a local minimum, while the rest of the image converges to the true minimum. There's still a little left to gain here, but you need to be clever about it, because there's no reason for a neural network to learn all the many combinations of hair pixels in this application. That could require as many parameters as all the neurons involved in generating the faces, I'd bet.

Thinking more about it, the shape of the solution space is sufficiently different for hair vs. faces that any given combination of {optimization function, hyperparameters, training data} is unlikely to optimize well for both. You probably need some other sort of special tuning, like a spatially local adaptive gradient for regions of hair.
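
To make the "spatially local" idea concrete, here is a rough sketch of one possible reading: scale the per-pixel loss (and therefore the gradient) differently inside a hair mask. Everything here is an assumption for illustration, not something from the model being discussed: the function name `masked_weighted_mse`, the `hair_weight` value, the choice to down-weight rather than up-weight hair, and the idea that a hair mask is available at all. A truly adaptive scheme would adjust the weights during training; this one keeps them fixed.

```python
import numpy as np

def masked_weighted_mse(pred, target, hair_mask, hair_weight=0.25):
    """Mean squared error with a per-pixel weight (hypothetical helper).

    Pixels where hair_mask is True get weight hair_weight, all others 1.0.
    Returns (loss, grad), where grad is d(loss)/d(pred).
    """
    weights = np.where(hair_mask, hair_weight, 1.0)
    diff = pred - target
    loss = np.sum(weights * diff ** 2) / diff.size
    grad = 2.0 * weights * diff / diff.size
    return loss, grad

# Toy 8x8 "image": the top three rows stand in for the hair region.
rng = np.random.default_rng(0)
target = rng.random((8, 8))
pred = target + 0.1  # uniform error everywhere
hair_mask = np.zeros((8, 8), dtype=bool)
hair_mask[:3, :] = True

loss, grad = masked_weighted_mse(pred, target, hair_mask)
# With a uniform error, the gradient inside the mask is hair_weight
# times the gradient outside it.
print(grad[0, 0] / grad[-1, 0])
```

With the same error everywhere, the hair pixels pull on the parameters a quarter as hard as the face pixels, so the two regions effectively train at different rates.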