Hacker News
by pmontra 528 days ago
> The only way this trend reverses, is if compute becomes so cheap and ubiquitous, that everyone can achieve the necessary scale.

We would still need 100M+ images with accurate labels. That work can be done collectively and open sourced, but the dataset must also be maintained. I don't think it will be easy.

2 comments

DINOv2 is an unsupervised model. It learns both a high-quality global image representation and local representations with no labels. It's becoming strikingly clear that foundation models are the go-to choice for common data types such as natural images, text, video, and audio. The labels are effectively free; the hard part now is extracting quality from massive datasets.
The other way it can reverse is by discovering better methods to train models, or to fine-tune existing ones with LoRA or the like.

How did Chinese companies do it? Is it a fabricated claim? https://slashdot.org/story/24/12/27/0420235/chinese-firm-tra...