| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by sashank_1509 1202 days ago
	I think it’s fine you’re focused on retrieval but you should add that as a caveat to your results, 100 times better at retrieval. As an ML researcher in grad school here’s what >80% use case of clip I’ve seen: 1. Take a random image, take a random set of text (can just be categories separated by commas). CLIP will find the text that’s the best match to your image. CLIP is also incredibly robust at this, you can literally take an image with your phone and it will give you reasonable results. If you speed such a model up by 100X in inference or training, that would be a huge deal to the entire ML research community and you can expect some best paper awards (maybe even VC capital looking at stable diffusion) to come your way

1 comments

vov_or 1202 days ago

Hi! You are right that we had to clarify that "100 times better at retrieval". Btw, we have plans to tune models, evaluate, and publish results in different tasks (zero-shot ImageNet classification, etc)