Hacker News
by godelski 1117 days ago
> GPT-4 never contained such tasks and data

No task, but we need to be clear that it did have the data. Remember that GPT4 was trained on a significant portion of the internet, which likely includes sites like Reddit and game fact websites. So there's a good chance GPT4 learned the tech tree and was trained on data about how to progress up that tree, including speedrunner discussions. (Also remember that as of March, GPT4 is trained on images too, not just text.)

What data it was trained on is very important, and I'm not sure why we keep coming back to this issue. "GPT4 has no zero-shot data" should be as drilled into everyone's head as sayings like "correlation does not equate to causation" and "garbage in, garbage out". Maybe people do not know this data is on the internet? But I'd be surprised if the average HN user thought that way.

This doesn't make the paper less valuable or meaningful. But it is more like watching a 10-year-old who's read every chess book and played against computers beat (or do really well against) a skilled player, versus a 10-year-old who's never heard of chess beating a skilled player. Both are still impressive; the latter just seems like magic, though, and should raise suspicion.