by avatar042 1741 days ago
Thanks for the info on markets. What made you consider fine-tuning further on your own data? Was CLIP not good enough to test the market?

FWIW, I recall seeing something similar with Google Cloud's Video Intelligence API (https://towardsdatascience.com/building-an-ai-powered-search...). Building something generic would be hard to get right, especially if your users expect both high precision and high recall from their search results.

Re: licensing, the startup world is something of a Wild West these days, with folks offering pre-trained models as a service without really thinking through the licensing implications (on both the dataset and the model front). Huggingface is a classic example: they seem to suggest that it's perfectly OK to fine-tune and use the models commercially (https://github.com/huggingface/transformers/issues/3357#issu...), but I'm not certain that their lawyers would put it the same way.

1 comment

Pre-trained CLIP gets you 95% of the way there, so you're correct: fine-tuning isn't necessary to test the market. The one downside of pre-trained CLIP is that it hasn't been trained on still frames extracted from videos. Those frames have different noise characteristics and contain considerably more motion blur than the average image used for training.
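
For anyone who wants to probe that gap before collecting real video frames, one option is to blur ordinary still images synthetically and see how retrieval quality changes. Below is a rough sketch, assuming a horizontal box blur is an acceptable stand-in for motion blur (that is an assumption, not something measured here). The function name `horizontal_motion_blur` and its `kernel_size` parameter are made up for illustration.

```python
def horizontal_motion_blur(image, kernel_size=5):
    """Blur each row with a 1-D box kernel, a crude proxy for camera motion.

    `image` is a list of rows of grayscale floats. Edges are handled by
    clamping indices to the nearest valid pixel (replicate padding).
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    blurred = []
    for row in image:
        width = len(row)
        new_row = []
        for x in range(width):
            # Average the pixel with its horizontal neighbours.
            window = [
                row[min(max(x + dx, 0), width - 1)]
                for dx in range(-half, half + 1)
            ]
            new_row.append(sum(window) / kernel_size)
        blurred.append(new_row)
    return blurred


# A sharp vertical edge: dark on the left, bright on the right.
frame = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(2)]
blurred_frame = horizontal_motion_blur(frame, kernel_size=3)
```

With `kernel_size=3` the hard 0-to-1 edge in `frame` is smeared across two pixels, which is the kind of softening a moving camera or subject introduces. Real motion blur varies in direction and length, so treat this only as a starting point.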