| HN Mirror


	by neurobro 2315 days ago
	Not sure about bad labels, but semi-supervised learning is the term for training on data where most of the labels are missing. In one common approach (self-training, also called pseudo-labeling), the model makes predictions on the unlabeled data and uses its highest-confidence predictions as additional training data. Generative models can also "dream up" entirely new training examples. There is a risk of amplifying confidence in bad predictions, but it often works better than using only the labeled portion of the data.
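
	To make the "use the highest confidence predictions as extra training data" loop concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the nearest-centroid classifier, the softmax-over-distance confidence score, the 0.9 threshold, the function names, and the toy points. It is not any particular library's implementation.

```python
import math

def fit_centroids(points, labels):
    """Toy classifier: one centroid (mean point) per class."""
    sums, counts = {}, {}
    for (x, y), lab in zip(points, labels):
        sx, sy = sums.get(lab, (0.0, 0.0))
        sums[lab] = (sx + x, sy + y)
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab])
            for lab, (sx, sy) in sums.items()}

def predict_with_confidence(centroids, point):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    scores = {lab: math.exp(-math.dist(point, c)) for lab, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())

def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.9, max_rounds=10):
    """Self-training: repeatedly promote confident predictions to training data."""
    train_x, train_y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(train_x, train_y)
        still_unlabeled, added = [], 0
        for p in pool:
            label, conf = predict_with_confidence(centroids, p)
            if conf >= threshold:
                # Pseudo-label: treat the model's own prediction as ground truth.
                train_x.append(p)
                train_y.append(label)
                added += 1
            else:
                still_unlabeled.append(p)
        pool = still_unlabeled
        if added == 0:  # nothing confident enough left, so stop
            break
    return fit_centroids(train_x, train_y), train_x, train_y, pool

# Hypothetical toy data: two labeled points, five unlabeled ones.
labeled_x = [(0.0, 0.0), (4.0, 4.0)]
labeled_y = ["a", "b"]
unlabeled_x = [(0.5, 0.2), (3.8, 4.1), (2.0, 2.0), (-1.0, 0.0), (5.0, 5.0)]

centroids, train_x, train_y, pool = self_train(labeled_x, labeled_y, unlabeled_x)
print("training set size:", len(train_x))
print("left unlabeled:", pool)
```

	The threshold is what limits the "amplifying bad predictions" risk: the ambiguous point halfway between the two classes never clears it, so it stays in the unlabeled pool instead of being given a guessed label. Set the threshold too low and wrong guesses get fed back in as if they were real labels.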