by thesz 376 days ago
> the assertion that "VLMs don't actually see - they rely on memorized knowledge instead of visual analysis". If that were really true, there's no way they would have scored as high as 17%.

Memorization does not rule out generalization: the ability to memorize itself leads to (some) generalization [1].
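
A toy sketch of what I mean, not the method from [1]. Everything here is made up for illustration: the 5-bit inputs, the `label` rule, and the choice to let one table see only the first three bits. Both tables do nothing but memorize training labels, yet the one that memorizes a partial view gets unseen inputs right, while the one that memorizes whole inputs can only guess.

```python
from itertools import product

def label(x):
    # Hypothetical ground truth: majority vote over the first three bits.
    return int(x[0] + x[1] + x[2] >= 2)

def memorize(examples, key):
    # Pure memorization: store the label seen for each key, nothing else.
    table = {}
    for x in examples:
        table[key(x)] = label(x)
    return table

def accuracy(table, examples, key, default=0):
    # Keys never seen in training fall back to a fixed default guess.
    hits = sum(table.get(key(x), default) == label(x) for x in examples)
    return hits / len(examples)

inputs = list(product([0, 1], repeat=5))
train = [x for x in inputs if x[4] == 0]  # 16 examples
test = [x for x in inputs if x[4] == 1]   # 16 examples, none seen in training

whole = lambda x: x        # memorize entire inputs
partial = lambda x: x[:3]  # memorize only a 3-bit view of each input

whole_acc = accuracy(memorize(train, whole), test, whole)
partial_acc = accuracy(memorize(train, partial), test, partial)
print(whole_acc, partial_acc)  # 0.5 1.0
```

The setup is deliberately favorable (the label depends only on bits the partial table sees), so read it as "memorization can generalize", not "it always does".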

[1] https://proceedings.mlr.press/v80/chatterjee18a/chatterjee18...