by k2052 1190 days ago
You are right that my method differed slightly so I did things again. It took me one try to find a sequence of moves that "breaks" what is claimed. You just have to make odd patterns of moves and it clearly has no understanding of the position.

Here is the convo:

me: You are a chess grandmaster playing as black and your goal is to win in as few moves as possible. I will give you the move sequence, and you will return your next move. No explanation needed

ChatGPT: Alright, I'm ready to play! Please give me the move sequence.

me: 1. e3 Nf6 2. f4 d6 3. e4

ChatGPT: My next move as black would be 3... e5

Completely ignoring the hanging pawn on e4. This is not the play of a 1400 Elo player. It is the play of something predicting patterns.
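
For anyone repeating this: the reply comes back as free text ("My next move as black would be 3... e5"), so you have to pull the move out before you can check it. Here is a minimal sketch of that step. The function name and the regex are my own illustration, not anything from the article, and the pattern only checks that a token looks like algebraic notation, not that the move is legal in the position.

```python
import re
from typing import Optional

# Loose pattern for a move in standard algebraic notation (SAN): castling,
# or an optional piece letter, optional disambiguation, optional capture,
# a target square, and an optional promotion.
# It checks the shape of the token only, never legality.
SAN_MOVE = re.compile(
    r"\b(O-O-O|O-O|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](?:=[QRBN])?)"
)


def extract_reply_move(reply: str) -> Optional[str]:
    """Return the last SAN-looking token in a free-text reply, or None."""
    matches = SAN_MOVE.findall(reply)
    return matches[-1] if matches else None


if __name__ == "__main__":
    # The reply quoted in the conversation above.
    print(extract_reply_move("My next move as black would be 3... e5"))  # e5
```

Taking the last match is a judgment call: it assumes the model states its move at the end of the sentence, which held in the reply above but is not guaranteed.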

I ran a bunch of experiments in the past where I played normal moves, and ChatGPT did respond extraordinarily well. With the right prompts and sequences you can get it to play like a strong grandmaster. But that is a "trick" you are getting it to perform by choosing good data and prompts. It is impressive, but it is not doing what the article claims.


I'll add in as someone new to chess (~800 Elo):

ChatGPT is in no way 1400, or even close to it. The fact that this article gets upvoted around here is proof that people aren't thinking clearly about this stuff. It's trivially easy to prove it wrong, like unbelievably so. I tried the same prompt, and within 12 moves it made multiple ridiculous errors I never would, and then an illegal move.

Keep in mind a 1400-level player would need to make basically zero mistakes that bad in a typical game, and would need to play 30-50 moves in that fashion, with the final moves being some of the most important and hardest to get right. There's just no way it's even close. My guess would be that even if you correct its many errors, it's something like ~200 Elo. Pure FUD.
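
To put rough numbers on the "zero bad mistakes over 30-50 moves" point, here is a back-of-the-envelope sketch. It assumes every move carries the same independent chance of a gross blunder, which is a simplification, and the per-move rates below are made up for illustration rather than measured from ChatGPT or from real 1400-rated games.

```python
def clean_game_probability(blunder_rate: float, moves: int) -> float:
    """Chance of getting through `moves` moves with no gross blunder,
    assuming each move independently blunders with probability `blunder_rate`."""
    return (1.0 - blunder_rate) ** moves


if __name__ == "__main__":
    # Hypothetical per-move blunder rates, chosen only to show the trend.
    for rate in (0.01, 0.05, 0.10):
        for moves in (30, 50):
            p = clean_game_probability(rate, moves)
            print(f"blunder rate {rate:.2f}, {moves} moves: {p:.3f}")
```

Even a modest per-move error rate compounds fast over a full game, which is why a handful of ridiculous errors in the first 12 moves is hard to square with a 1400 rating.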

The author of this article is cashing in on the hype, and I'm wondering how they even got the results they did.

They probably got them. The problem is that it's difficult to repeat, thanks to temperature, meaning users will get a random spread of outcomes. Today, someone got a legal game. Tomorrow, someone might get a grandmaster level game. But then everyone else trying to repeat or leverage this ends up with worse luck and gets illegal moves or, if they're lucky, moves that make sense in a limited context (such as related to specific gambits etc) but have no role in longer-term play.
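
A rough sketch of what temperature does to the spread, for anyone who hasn't seen it. The candidate moves and their scores are invented for illustration (a real model scores tokens, not whole moves, and I have no access to its actual numbers); only the temperature-scaled softmax itself is the standard technique.

```python
import math
import random
from collections import Counter


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_move(candidates, logits, temperature, rng):
    """Draw one candidate according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical replies after 1. e3 Nf6 2. f4 d6 3. e4, with made-up scores.
# Ke7 is illegal in that position (Black's own pawn is on e7).
CANDIDATES = ["Nxe4", "e5", "Ke7"]
LOGITS = [2.0, 1.5, -1.0]

if __name__ == "__main__":
    rng = random.Random(0)
    for temperature in (0.1, 1.0, 5.0):
        draws = Counter(
            sample_move(CANDIDATES, LOGITS, temperature, rng) for _ in range(1000)
        )
        print(temperature, dict(draws))
```

At low temperature nearly every draw is the top-scored move; as temperature rises, the low-scored illegal move starts showing up too, which is the "worse luck" other people run into when they try to repeat a result.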

Big caveat that I'm not into chess, but I have heard that higher-level play is extremely pattern-based. It seems like ChatGPT would work well as long as you stick to patterns that people have studied and documented. Less optimal play would be more random and thus break from the patterns ChatGPT would have picked up from its training corpus.

Criticisms like this are exactly how the model will grow multimodal support for chess moves.

Keep poking it and criticizing it. Microsoft and OpenAI are on HN and they're listening. They could find nothing more salient to tout in their next release or press conference than full chess support.

With zero effort the thing understands uber-domain-specific chess notation and the human prompt to play a game. To think it stops here is wild.

People are hyping it because they want to get involved. They want to see the crazy and exciting future this leads to.

I doubt they'll pursue this. There is no advantage to it. ChatGPT will never beat Stockfish, and Stockfish would do it on a ludicrously small fraction of the resources. It would send the wrong message.

Some future AI might, but a language model won't.

My uber-obscure question that guaranteed a confident hallucination got fixed in the next update after I mentioned it. Probably just a coincidence.