by arugulum
2001 days ago
570GB of Common Crawl (CC) remained after filtering, but only 40% of that CC data was seen even once during training, and CC in turn makes up only 60% of the training data. You could work through the math to find the rough size of GPT-3's training data, but it sounds like The Pile is of comparable size.
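For what it's worth, here is a rough sketch of that math in Python. It uses only the three figures quoted above, and it assumes that "60% of the training data" means CC's share of the data actually seen during training; if the 60% refers to something else, the estimate changes. The function and variable names are made up for illustration.

```python
def estimate_data_seen_gb(cc_size_gb, cc_fraction_seen, cc_share_of_mix):
    """Back-of-the-envelope estimate of data seen during training.

    Assumption: cc_share_of_mix is Common Crawl's share of everything
    the model saw in training, so total = CC seen / CC share.
    """
    cc_seen_gb = cc_size_gb * cc_fraction_seen      # CC data seen at least once
    total_seen_gb = cc_seen_gb / cc_share_of_mix    # scale up to the whole mix
    return cc_seen_gb, total_seen_gb


# Figures taken from the comment above: 570GB, 40% seen, 60% of the mix.
cc_seen, total_seen = estimate_data_seen_gb(570, 0.40, 0.60)
print(f"CC seen during training: ~{cc_seen:.0f} GB")
print(f"Total data seen during training: ~{total_seen:.0f} GB")
```

Under that assumption this works out to roughly 228GB of CC seen and about 380GB seen overall. Treat it as an order-of-magnitude figure only, since it ignores repeated passes over the smaller datasets.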