Hacker News
by sillysaurusx 993 days ago
I wouldn’t be surprised if they do an actual OCR pass for every input image and just pass in the raw text as part of the prompt. That plus the image embedding should work well.
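
To make the guess concrete, here is a rough sketch of what "OCR text in the prompt plus the embedding" could look like. This is pure speculation about the shape of such a pipeline, not anything confirmed: `ModelInput`, `build_input`, and the `ocr` / `embed` callables are all hypothetical names, and the prompt wording is made up. In practice `ocr` could wrap something like Tesseract and `embed` would be whatever vision encoder the model uses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class ModelInput:
    """Hypothetical container for what gets handed to the model."""
    prompt: str                        # text prompt, including any OCR'd text
    image_embedding: Sequence[float]   # stand-in for the vision encoder output


def build_input(
    image: bytes,
    question: str,
    ocr: Callable[[bytes], str],
    embed: Callable[[bytes], Sequence[float]],
) -> ModelInput:
    """Combine raw OCR text with the image embedding (speculative sketch)."""
    ocr_text = ocr(image).strip()
    parts = []
    if ocr_text:
        # Only mention OCR output when the pass actually found some text.
        parts.append("Text found in the image (raw OCR):\n" + ocr_text)
    parts.append("Question: " + question)
    return ModelInput(prompt="\n\n".join(parts), image_embedding=embed(image))
```

The OCR and embedding steps are passed in as plain callables so the sketch stays independent of any particular OCR engine or encoder; the point is just that the raw text rides along in the prompt while the embedding carries everything else.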