Hacker News
by taneq 1296 days ago
> The exact text of the questions, with the corresponding correct answers, almost certainly appeared in ChatGPT's training set.

This seems like it would be easy to check for, so I’m sure it will come to light fairly quickly if so?

1 comment

OpenAI doesn't release its training datasets, AFAIK, but we know they're based on sources like Common Crawl, which scrapes websites the same way search engines do. So here's an experiment you can try at home: type "juan purchased an" into Google Search and look at the auto-complete suggestions. If it suggests the word "antique", that's thanks to question 18 on the no-calculator math section of this exam. (Similarly, if you type "jake buys" and it jumps in with "a bag of popcorn", that's question 3 on the calculator section.)
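If you have a sample of crawled pages on disk, the same idea can be sketched in a few lines of Python: measure how many of a question's word 5-grams show up verbatim in a page. This is only a rough illustration, not how OpenAI or anyone else actually checks for contamination. The function names, the 5-gram window, and the example strings are all made up for the sketch; the strings are placeholders, not text from the real exam.

```python
import re


def normalize(text):
    """Lowercase and turn runs of non-alphanumerics into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def ngrams(words, n):
    """Return the set of contiguous n-word sequences in a list of words."""
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_fraction(question, document, n=5):
    """Fraction of the question's word n-grams found verbatim in the document."""
    q = ngrams(normalize(question).split(), n)
    if not q:
        # Question is shorter than n words, so there is nothing to compare.
        return 0.0
    d = ngrams(normalize(document).split(), n)
    return len(q & d) / len(q)


# Placeholder strings (hypothetical, not taken from the exam).
question = "A train leaves the station at noon traveling at sixty miles per hour"
page = "Practice test: a train leaves the station at noon, traveling at sixty miles per hour. Answer: B"
unrelated = "The weather was sunny all week"

print(overlap_fraction(question, page))
print(overlap_fraction(question, unrelated))
```

A score near 1 means the page contains the question almost word for word; a score near 0 means it doesn't. Paraphrased copies would slip through, so a low score doesn't prove the question was absent.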