by fogx 1395 days ago
Image-to-text models that can be used for captioning already exist. The most common one is CLIP from OpenAI: https://openai.com/blog/clip/ Jina AI has an out-of-the-box implementation of it: https://clip-as-service.jina.ai/
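
For anyone wondering how CLIP fits captioning: as far as I understand it, CLIP does not generate text itself. It embeds images and text into a shared vector space and scores how well they match, so "captioning" with it usually means ranking a set of candidate captions. Here is a toy sketch of that ranking step. The vectors and the helper names (`cosine`, `rank_captions`) are made up for illustration; they are not real CLIP embeddings and not the OpenAI or Jina API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_captions(image_vec, caption_vecs):
    """Return (caption, score) pairs sorted from best to worst match."""
    scored = [(caption, cosine(image_vec, vec))
              for caption, vec in caption_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 3-dimensional "embeddings"; a real model would produce
# vectors with hundreds of dimensions from an image and text encoder.
image_vec = [0.9, 0.1, 0.2]
caption_vecs = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.3],
    "a bowl of soup": [0.2, 0.1, 0.9],
}

ranking = rank_captions(image_vec, caption_vecs)
best_caption = ranking[0][0]
print(best_caption)
```

With a real setup you would swap the hand-made vectors for embeddings returned by the model or by a service like the Jina one linked above; the ranking logic stays the same.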