by fogx, 1395 days ago
Image-to-text models for captioning already exist. The most common one is CLIP from OpenAI:
https://openai.com/blog/clip/
Jina AI has an out-of-the-box implementation of it:
https://clip-as-service.jina.ai/
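For anyone wondering how CLIP gets used for captioning in practice: as far as I understand it, it doesn't write a caption itself, it embeds the image and a set of candidate texts into the same vector space and you pick the text closest to the image. A rough sketch of that ranking step is below. The vectors are made-up toy numbers standing in for real CLIP embeddings, and `cosine` and `rank_captions` are hypothetical helper names, not part of any library.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_captions(image_vec, caption_vecs):
    """Return (score, caption) pairs sorted from best to worst match."""
    scored = [(cosine(image_vec, vec), text) for text, vec in caption_vecs.items()]
    return sorted(scored, reverse=True)

# Toy embeddings (assumption): a real setup would get these from a CLIP encoder.
image_vec = [0.9, 0.1, 0.2]
caption_vecs = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a cat": [0.1, 0.9, 0.2],
    "a diagram": [0.0, 0.1, 0.9],
}

ranking = rank_captions(image_vec, caption_vecs)
best_score, best_caption = ranking[0]
print(best_caption)
```

With real embeddings the candidate list is the limiting factor: the result can only be as good as the captions you offer it.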