by gitaarik 620 days ago
Isn't it because this test has since spread across the internet and the LLMs picked up on that, so now they give the correct answer?

Maybe try a new, unique logical question, and not the same question with a few words changed, because that might still closely match data the LLM has already seen.

1 comment

  the LLMs picked up on that, so now they give the correct answer
The models don't just 'pick up' information that appears on the internet. They must be retrained with that new data in the training set.

I tested the models 4 days after the paper was published.

The models are retrained every few months, and the process takes much more than 4 days.