Hacker News new | ask | show | jobs
by roywiggins 979 days ago
It's a bit weird that they can't even avoid this when it comes to images; GPT shouldn't really be obeying instructions from images at all! I wonder if it's just OCRing images and concatenating that into the prompt...
2 comments

It's much more sophisticated than just OCR. The model was trained on images and text at the same time - it isn't processing images in a separate step.

The GPT-4 paper has a bunch more about this.

Not really, I suppose; it's just a different type of prompt. The algorithm does not "know" what it is fed. Data is data.