| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by dm319 3208 days ago
	There is a tendency for machine learning (including neural networks) to over-fit data - i.e. the algorithm learns to recognise the particular data, rather than the real distinguishing predictors of the groups. As you say, these can be features that are by chance associated with what you are trying to discriminate. This is why the model is validated on a separate testing group from the training group which created it. There are lots of ways to do this, and the more sophisticated continually iterate training and testing to improve the model.