Hacker News
by Technotroll 972 days ago
Does a vision-language model process raw image data, or does it process OCR character output?
1 comment

GPT-4V seems to be doing the former, at least in my experiments with it. It interprets plots and categorises images.