by anxoo
430 days ago
"I set a plate on a table, and glass next to it. I set a marble on the plate. Then I pick up the marble, drop it in the glass. Then I turn the glass upside down and set it on the plate. Then, I pick up the glass and put it in the microwave. Where is the marble?"

the author claims that visual reasoning will help the model solve this problem, noting that gpt-4o got the question right after making a mistake at the beginning of its response. i asked gpt-4o, claude 3.7, and gemini 2.5 pro experimental, and all three answered correctly. the author also demonstrates an attempt at "visual reasoning" with gpt-4o, notes that the model got it wrong, then handwaves the failure away by saying the model wasn't trained for visual reasoning. "visual reasoning" is a tweet-worthy thought that the author completely fails to justify.