by theGnuMe 1281 days ago
We know more parameters improve generalization, which is why it's unclear to me what they mean by more training FLOPs.
1 comment

One new parameter = 1 new node on the crypto nutters' network to do additional compute for the parameter? Lol, I can see many ways these crypto nutters can attempt to weakly justify buying into their network.