by xkapastel 1231 days ago
CLIP Interrogator uses BLIP, an image captioning model, and also tries a bunch of prompts with CLIP. I guess you mean that this model uses only the captioning model to generate the complete prompt? Is the code for this one available?
1 comment

Ah yes, this model treats this purely as image captioning. The model isn't open source yet.