by FrostAutomata 424 days ago
Interestingly, I've seen weaker models get a similar "riddle" right while a stronger one fails. It may be that models need to reach a certain size before they overfit to the riddles.