by blueboo 972 days ago
We were always in this regime. In general, it has been far more effective to take a proven model and fine-tune it. It was true five years ago, when you'd take a convnet out of a model zoo to fine-tune, and it’s true now.

If you can achieve the moonshot of gathering, generating, annotating enough data with the right distribution to train from scratch, and you use a SOTA bag of tricks to regularise, you might do better.

Bear in mind fine-tuning is literally just more training. Starting with a trained model is like starting with an incredibly well-initialised network.
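
To make the "well-initialised network" point concrete, here is a toy sketch, not a real vision model. It assumes a one-parameter "network" and a quadratic loss; `train`, `pretrain_target`, and `finetune_target` are hypothetical names made up for illustration. The same training loop is used for both runs; the only difference is where the weight starts.

```python
def train(w, target, lr=0.1, tol=1e-3, max_steps=10_000):
    """Plain gradient descent on the loss (w - target)**2.

    Returns the final weight and the number of steps taken.
    """
    steps = 0
    while abs(w - target) > tol and steps < max_steps:
        grad = 2 * (w - target)  # derivative of (w - target)**2
        w -= lr * grad
        steps += 1
    return w, steps


# Hypothetical tasks: a "general" pre-training task and a nearby "specific" one.
pretrain_target = 5.0
finetune_target = 5.5

# Pre-train once from an uninformed start.
pretrained_w, _ = train(0.0, pretrain_target)

# Same loop, same task, two different starting points.
scratch_w, steps_scratch = train(0.0, finetune_target)
finetuned_w, steps_finetune = train(pretrained_w, finetune_target)

print("steps from scratch:", steps_scratch)
print("steps when fine-tuning:", steps_finetune)
```

Under these assumptions both runs reach the same answer, and the one that starts from the pre-trained weight simply needs fewer steps. Real networks are far messier, but the structure of the argument is the same: fine-tuning is the same loop from a better starting point.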

1 comment

Yes, it's true that fine-tuning is not new, and indeed I also view it as "starting with an incredibly well-initialized network".

However, the promptable aspects of those vision models are completely new. You can define your tasks at runtime and steer the model's behavior. I think this makes it easier and faster to get insights from your images. Lastly, those models are trained on many different tasks, whereas previous models were general classifiers that could then be trained on a specific domain. This allows them, for example, to be reused across an organisation and saves you from creating multiple task-specific models.