Hacker News
by civilitty 1106 days ago
People training AI were already using CommonCrawl. There are too many data sources to figure out each one's API, so everyone just downloads CC from AWS.