| HN Mirror



	by version_five 1326 days ago
	I'd guess there is a bias-variance tradeoff. If you just want to make a certain kind of image, no doubt a manually labeled and curated dataset can be better. If you want a generic generative model that has learned a wide variety of stuff, scale wins. I can see LAION playing a similar role to ImageNet. The main application of ImageNet isn't directly training image recognition models. It's pretraining on diverse data so that a "big" (big in 2016) model can be fine-tuned easily on a small dataset, after learning to be a good feature extractor. From that perspective, the label quality (and concerns about bias and whatnot) is almost irrelevant.
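
	For anyone who wants the "feature extractor plus small dataset" idea spelled out, here is a rough sketch in plain Python. Everything in it is made up for illustration: `make_frozen_extractor` is a fixed random projection standing in for a backbone pretrained on something like ImageNet or LAION, `make_small_dataset` is a toy two-cluster dataset, and `train_head` fits a logistic-regression head on top. The only point is the shape of the workflow: the backbone stays frozen and only the small head sees the small dataset.

```python
import math
import random


def make_frozen_extractor(in_dim, feat_dim, seed=0):
    """Hypothetical stand-in for a pretrained backbone.

    A fixed random projection followed by ReLU. A real setup would load
    weights learned on a large, diverse dataset; the weights here are just
    seeded random numbers and are never updated.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(feat_dim)]

    def extract(x):
        return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return extract


def make_small_dataset(n_per_class=10, seed=1):
    """Toy 'small dataset': two well-separated 2D clusters, labels 0 and 1."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, center in ((0, -3.0), (1, 3.0)):
        for _ in range(n_per_class):
            points.append([center + rng.gauss(0.0, 0.3), center + rng.gauss(0.0, 0.3)])
            labels.append(label)
    return points, labels


def train_head(features, labels, epochs=100, lr=0.05):
    """Fit a logistic-regression head with plain SGD on the frozen features."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * grad * fi for wi, fi in zip(w, f)]
            b -= lr * grad
    return w, b


def predict(w, b, f):
    """Return class 1 if the head's logit is positive, else class 0."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0


def demo():
    """Extract features once, train only the head, return training accuracy."""
    extract = make_frozen_extractor(in_dim=2, feat_dim=8)
    points, labels = make_small_dataset()
    features = [extract(p) for p in points]
    w, b = train_head(features, labels)
    correct = sum(predict(w, b, f) == y for f, y in zip(features, labels))
    return correct / len(labels)


if __name__ == "__main__":
    print(f"train accuracy with frozen features: {demo():.2f}")
```

	Note that the labels only ever touch `train_head`. Whatever produced the backbone weights is out of the picture by the time the small dataset shows up, which is roughly why pretraining label quality matters less from this angle.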