Hacker News
by gwern 1521 days ago
The data here is effectively free. I don't think they would exhaust The Pile, which anyone can download at no cost. The same holds for text-to-image models like DALL-E 2: while OpenAI may have invested in its own datasets, everyone else can just download LAION-400M (or, if they are really ambitious, LAION-5B: https://laion.ai/laion-5b-a-new-era-of-open-large-scale-mult... ).