Hacker News
by theGnuMe 1281 days ago
It's certainly a potential path. It's the Folding@home model plus crypto. It may offset electricity costs if you can sell the tokens...

I am interested in understanding why more training FLOPs rather than, say, more parameters. Surely bigger models perform better, but perhaps there is a limit?

I don't quite understand how distributed subtrees will work.

1 comment

I'm not an expert in ML theory, but as I see it, more compute should let a model fit its training data better, at the risk of overfitting if parameters are not added as well. For a given problem space, each additional parameter introduces another degree of freedom, which should let the model map a larger domain of inputs to outputs (more answers to more questions). And if we define AGI as a network that can answer questions across n > 2 domains (e.g. can we do image classification and a chatbot and synthesize them into a coherent system that passes a Turing test), then adding parameters makes sense as a way to increase the range of outputs.

Interestingly, I don't think it's clear how a network's parameters and compute would actually find a region within a combined problem space where the mapping from question to answer gives sensible results. It seems like we need more tools to extend ML.

We know more parameters improve generalization, so that is why it's unclear to me what they mean by more training FLOPs.
One new parameter = 1 new node on the crypto nutters' network to do additional compute for the parameter? Lol, I can see many ways these crypto nutters can attempt to weakly justify buying into their network.
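
For what it's worth, one way to make "more training FLOPs vs. more parameters" concrete is the common rule of thumb that training a dense transformer costs roughly 6 FLOPs per parameter per training token. That approximation is my assumption here, not something from the project being discussed, and the model sizes below are made-up numbers. A minimal sketch:

```python
# Rough sketch only. Assumes the common approximation
#   training FLOPs ~= 6 * parameters * training tokens
# for a dense transformer. The sizes below are hypothetical.

def training_flops(n_params: int, n_tokens: int) -> int:
    """Approximate total training compute under the 6 * N * D rule of thumb."""
    return 6 * n_params * n_tokens

# Baseline: 1B parameters trained on 20B tokens.
base = training_flops(1_000_000_000, 20_000_000_000)

# Option A: same model, twice the training data (more FLOPs, same parameters).
more_tokens = training_flops(1_000_000_000, 40_000_000_000)

# Option B: twice the parameters, same training data.
more_params = training_flops(2_000_000_000, 20_000_000_000)

print(f"baseline:    {base:.2e} FLOPs")
print(f"more tokens: {more_tokens:.2e} FLOPs")
print(f"more params: {more_params:.2e} FLOPs")
```

Under that approximation both options cost the same compute, so "more training FLOPs" by itself doesn't say whether the extra budget goes into a bigger model or into training the same model for longer. Those are two separate knobs.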