by actsasbuffoon
125 days ago
Yeah, spatial reasoning has been a weak spot for LLMs. I’m actually building a new code exercise for my company right now where the candidate is allowed to use any AI they want, but it involves spatial reasoning. I ran Opus 4.6 and Codex 5.3 (xhigh) on it and both came back with passable answers, but I was able to double their scores by doing it by hand. It’ll be interesting to see what happens if a candidate ever shows up and wants to use Deep Think. It might blow right through my exercise.