by sulam 95 days ago
They have a _text_ model. There is some correlation between the text model and the world, but it’s loose and only because there’s a lot of text about the world. And of course robotics researchers are having to build world models, but these are far from general. If they had a real world model, I could tell them I want to play a game of chess and they would be able to remember where the pieces are from move to move.

What makes you think that text is inherently a worse reflection of the world than light is?

All world models are lossy as fuck, by the way. I could give you a list of chess moves and force you to recover the complete board state from it, and you wouldn't fare that much better than an off the shelf LLM would. An LLM trained for it would kick ass though.

> I could give you a list of chess moves and force you to recover the complete board state from it, and you wouldn't fare that much better than an off the shelf LLM would

idk, I would expect that anyone with an understanding of the rules of chess, and of whatever notation the moves are in, would be able to do it reasonably well? Does that really sound so hard to you? People used to play correspondence chess. Heck, I remember people doing it over email.

In comparison, current AI models start to completely lose the plot after 15 or so moves: pulling third, fourth and fifth bishops, rooks, etc. out of thin air, claiming checkmate erroneously, and so on, to the point that it's not possible to play a coherent game with them.

I would expect that off the shelf GPT-5.4 would be able to do it when prompted carefully, yes. Through reasoning - by playing every move step by step and updating the board one move at a time to arrive at a final board state.
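For concreteness, here is a minimal sketch of that "one move at a time" bookkeeping. It assumes the moves come in plain coordinate notation (`e2e4`, `e7e8q` for a promotion) rather than SAN, and it does no legality checking at all. The names `START`, `square`, `apply_move` and `replay` are made up for illustration, not taken from any library.

```python
# Sketch only: replay coordinate-notation moves and track the board state.
# Assumes moves like "e2e4" or "e7e8q"; it does NOT verify that moves are legal.

START = [
    "rnbqkbnr",  # rank 8 (black back rank)
    "pppppppp",
    "........",
    "........",
    "........",
    "........",
    "PPPPPPPP",
    "RNBQKBNR",  # rank 1 (white back rank)
]

def square(name):
    """Convert a square name like 'e2' to (row, col), where row 0 is rank 8."""
    col = ord(name[0]) - ord("a")
    row = 8 - int(name[1])
    return row, col

def apply_move(board, move):
    """Apply one move in place, handling castling, en passant and promotion."""
    (r1, c1), (r2, c2) = square(move[0:2]), square(move[2:4])
    piece = board[r1][c1]
    if piece == ".":
        raise ValueError(f"no piece on {move[0:2]}")
    # En passant: a pawn moves diagonally onto an empty square,
    # so the captured pawn sits beside the starting square.
    if piece in "Pp" and c1 != c2 and board[r2][c2] == ".":
        board[r1][c2] = "."
    # Castling: the king moves two files, so the rook has to jump over it.
    if piece in "Kk" and abs(c2 - c1) == 2:
        rook_from, rook_to = (7, 5) if c2 > c1 else (0, 3)
        board[r1][rook_to] = board[r1][rook_from]
        board[r1][rook_from] = "."
    # Promotion: an optional fifth character names the new piece.
    if len(move) == 5:
        piece = move[4].upper() if piece.isupper() else move[4].lower()
    board[r2][c2] = piece
    board[r1][c1] = "."

def replay(moves):
    """Start from the initial position and apply every move in order."""
    board = [list(row) for row in START]
    for move in moves:
        apply_move(board, move)
    return ["".join(row) for row in board]

if __name__ == "__main__":
    game = ["e2e4", "e7e5", "g1f3", "b8c6", "f1c4", "f8c5", "e1g1"]
    print("\n".join(replay(game)))
```

The point is that each step only touches a handful of squares, so the state never has to be reconstructed from scratch. Whether a model actually carries out this bookkeeping reliably over a long game is exactly what's being argued about here.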

On the other hand, recovering the full board state in a single forward pass? That takes some special training.

Same goes for meatbag chess. A correspondence chess aficionado might be able to take a glance at a list of moves and see the entire game unfold in his mind's eye. A casual player who only knows how to play chess at 600 Elo on a board that's in front of him would have to retrace every move carefully, and might make errors while at it.

Try to play a simple over-the-board-style game with 5.4 in whatever notation you choose to use (or just descriptions, literally anything). Prediction: it will start out fine, but in the midgame it will be very hard to keep it on track, and the endgame will make you give up.

> What makes you think that text is inherently a worse reflection of the world than light is?

What does the color green look like?

A color without form can't look like anything.
It doesn't look like anything to me.

"What makes you think that text is inherently a worse reflection of the world than light is?"

Come on man, did you think before you asked that one :)?