Ah okay, well it will still lose some of the time, which is surprising. And it will lose in surprising way, e.g., thinking for 14 seconds and then making an extremely basic mistake like not seeing it already have two on a row and could just win.
.. and you can "program" a neural network — so simple it can be implemented by boxes full of marbles and simple rules about how to interact with the boxes — to learn by playing tictactoe until it always plays perfect games. This is frequently chosen as a lesson in how neural network training even works.
But I have a different challenge for you: train a human to play tictactoe, but never allow them to see the game visually, even in examples. You have to train them to play only by spoken words.
Point being that tictactoe is a visual game and when you're only teaching a model to learn from the vast sea of stream-of-tokens (similar to stream-of-phonemes) language, visual games like this aren't going to be well covered in the training set, nor is it going to be easy to generalize to playing them.
Well whatever your story is, I know with near certainty that no amount of scaffolding is going to get you from an LLM that can't figure out tic-tac-toe (but will confidently make bad moves) to something that can replace a human in an economically important job.
- but tokens are not letters
- but humans fail too
- just wait, we are on an S curve to AGI
- but your prompt was incorrect
- but I tried and here it works
Meanwhile, their claims:
- LLMs are performing at PhD levels.
- AGI is around the corner
- humanity will be wiped out
- situational awareness report