Hacker News
by tlb 1207 days ago
It's trained on data crawled from the web, using CommonCrawl among other sources. See https://github.com/openai/following-instructions-human-feedb.... CommonCrawl certainly respects robots.txt, and I assume the other sources do too.
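For anyone wondering what "respects robots.txt" means in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the example.com URLs, and the `is_allowed` helper are made up for illustration; `CCBot` is the user agent Common Crawl's crawler identifies as. This is not Common Crawl's actual code, just the kind of check a well-behaved crawler does before fetching a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/public/page", "/private/page"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "CCBot", url))
```

A site that doesn't want to be crawled by a given bot adds a `Disallow` rule under that bot's `User-agent` line, and a compliant crawler skips the matching URLs.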