by jononor
669 days ago
That CLIP is not data- or sample-efficient is well known, and research to improve this is ongoing. Here is a 2021 paper that outperforms a CLIP baseline with 7x less data: https://arxiv.org/abs/2110.05208
I am sure there are more recent papers also, possibly with larger gains.
I do not see why Adobe would not be able to make a good CLIP-like model with 0.6 billion images.
|
Unity and Epic have tried and failed to do so. There are lots of talented people out there at companies with lots of money. Adobe, Unity and Epic aren't the only ones with licensing bureau images either. And anyway, did you consider that the vast majority of content in licensing bureaus is garbage? Or that the captions are garbage? Or that maybe they have wildly overstated the number of images they have?
Adobe hasn't published anything about their architecture or approach for the simple reason that it is not clean in the way they advertise their models to be.