Hacker News
by Chu4eeno 3 hours ago
That's because post-training optimizes for benchmarks that test exactly that.

They tend to collapse into nonsense and hallucinations pretty quickly if you move even slightly outside the envelope of what the current visual-reasoning benchmaxxing covers.