Hacker News
by amasad 2357 days ago
This is amusing but doesn't really prove anything special about GPT-2 or general intelligence. You can probably get similar results with an n-gram model.

Though the GPT-2 engine is not particularly strong, I don't think you would get similar strength from an n-gram model. You need longer-term correlations across the move sequence, which is generally where transformers do well.
Someone apparently did it with n-grams in 2015, and it reaches move 13 or so: https://twitter.com/kcimc/status/1214713412963291136
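
To make the "longer-term correlations" point concrete, here is a minimal sketch of an n-gram move predictor. It is not the 2015 implementation linked above; the function names `train_ngram` and `predict` and the toy games are hypothetical, made up for illustration. The model only ever sees the last n-1 moves, so it cannot know what happened earlier in the game.

```python
from collections import Counter, defaultdict

def train_ngram(games, n=3):
    """Count which move follows each context of n-1 previous moves."""
    counts = defaultdict(Counter)
    for moves in games:
        # Pad the start so the first moves of a game also have a context.
        padded = ["<s>"] * (n - 1) + list(moves)
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])
            counts[context][padded[i]] += 1
    return counts

def predict(counts, history, n=3):
    """Return the most frequent continuation of the last n-1 moves, or None."""
    padded = ["<s>"] * (n - 1) + list(history)
    context = tuple(padded[len(padded) - (n - 1):])
    if context not in counts:
        return None  # context never seen in training
    return counts[context].most_common(1)[0][0]

# Toy training data (hypothetical, not real game records).
games = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5"],
    ["e4", "e5", "Nf3", "Nc6", "Bc4"],
    ["e4", "e5", "Nf3", "Nc6", "Bb5"],
    ["d4", "d5", "c4", "e6", "Nc3"],
]
counts = train_ngram(games, n=3)

# Same last two moves, different openings: the trigram model answers the same.
after_e4 = predict(counts, ["e4", "e5", "Nf3", "Nc6"])
after_d4 = predict(counts, ["d4", "d5", "Nf3", "Nc6"])
unseen = predict(counts, ["d4", "Nf6"])
print(after_e4, after_d4, unseen)
```

In the second call the model still suggests Bb5, which is not even legal after 1. d4 d5 2. Nf3 Nc6 because the e-pawn still blocks the bishop. Raising n helps a little, but contexts get sparse quickly, which is probably why such a model falls apart after the opening.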

Someone else tried this with GPT-2 a few months ago on algebraic notation and their engine seems to get to move 40 without blundering: https://www.reddit.com/r/slatestarcodex/comments/el87vo/a_ve...

Board state + algebraic notation might be the trick to make a strong engine.