by axg11 1740 days ago
Pre-trained CLIP gets you 95% of the way there, so you're correct: fine-tuning isn't necessary to test the market. The one drawback of pre-trained CLIP is that it hasn't been trained on still frames extracted from videos. Those frames have different noise characteristics and contain considerably more motion blur than the typical image used for training.
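
If you want a rough feel for that gap before collecting real video data, one option is to synthetically blur clean images and see how your results change. Below is a minimal sketch of that idea in NumPy, assuming a 2D grayscale array and purely horizontal, linear motion. The names `motion_blur` and `peak_gradient` are made up for illustration, and real video blur (plus compression noise) is messier than a box filter.

```python
import numpy as np


def motion_blur(image: np.ndarray, length: int = 9) -> np.ndarray:
    """Approximate horizontal motion blur by averaging `length` pixels along each row.

    Assumes `image` is a 2D grayscale array. This is a box-filter approximation,
    not a model of any particular camera or codec.
    """
    if length < 1:
        raise ValueError("length must be >= 1")
    kernel = np.ones(length) / length
    left = length // 2
    right = length - 1 - left
    # Edge-pad the columns so the blurred image keeps the input width.
    padded = np.pad(image.astype(np.float64), ((0, 0), (left, right)), mode="edge")
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, padded
    )


def peak_gradient(image: np.ndarray) -> float:
    """Largest horizontal pixel-to-pixel jump: a crude sharpness indicator."""
    return float(np.abs(np.diff(image, axis=1)).max())


# Toy input: a sharp vertical edge (black on the left, white on the right).
sharp = np.zeros((8, 32))
sharp[:, 16:] = 1.0
blurred = motion_blur(sharp, length=9)

print(peak_gradient(sharp), peak_gradient(blurred))
```

Feeding the sharp and blurred versions of the same images through CLIP and comparing the scores would give a quick, if imperfect, read on how much the blur matters for your use case.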