by nickpsecurity
242 days ago
You're onto something. The BabyLM competition had caps. Many LLMs were using 1TB of training data for some time. In many cases, I can't even see how many GPU hours, or what size cluster of which GPUs, the pretraining required. If I can't afford it, then it doesn't matter what it achieved. What I can afford is what I have to choose from.