by SirWart 906 days ago
As noted in the quote, the CommonCrawl WARC files don’t contain the images themselves. LAION used those files to find img tags and then downloaded the images itself.
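To make that concrete, here is a rough sketch of the "find img tags" step. This is not LAION's actual pipeline: the `ImgTagCollector` class and `extract_image_links` function are hypothetical names, and it assumes you already have a page's HTML and URL pulled out of a crawl record. It uses only the Python standard library and collects image URLs with their alt text; downloading them would be a separate step.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgTagCollector(HTMLParser):
    """Collects (absolute image URL, alt text) pairs from img tags.

    Hypothetical helper for illustration only, not LAION's code.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src")
        if src:
            # Resolve relative links against the URL of the crawled page.
            absolute = urljoin(self.base_url, src)
            self.images.append((absolute, attr_map.get("alt") or ""))


def extract_image_links(html, base_url):
    """Return the image links found in one HTML page; nothing is downloaded."""
    collector = ImgTagCollector(base_url)
    collector.feed(html)
    return collector.images
```

The point is that the crawl only gives you the links and surrounding text. Fetching the image bytes from each URL is on whoever builds the dataset.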