Hacker News
by Slix 1396 days ago
Could training be crowdsourced among consumer GPUs, like Folding@home?
1 comment

Probably not for a few years. You need an A100 (maybe a few) just to backprop through a model that big in float32.
IIRC, they tweeted about using around 3800 in parallel.
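
For a rough sense of why float32 backprop gets heavy, here's a back-of-the-envelope sketch. Everything in it is an assumption on my part: the parameter counts are made up (not the actual model's size), the optimizer is assumed to be Adam-style with two extra buffers per parameter, the 40 GB figure is the smaller A100 variant, and activation memory is ignored entirely, so real requirements would be higher.

```python
import math

BYTES_PER_FLOAT32 = 4


def training_bytes(num_params, optimizer_buffers=2):
    """Estimate bytes for weights, gradients, and optimizer state in float32.

    Assumes an Adam-style optimizer (2 buffers per parameter) by default.
    Activations are NOT counted, so this is a lower bound.
    """
    copies = 1 + 1 + optimizer_buffers  # weights + gradients + optimizer buffers
    return num_params * BYTES_PER_FLOAT32 * copies


def gpus_needed(num_params, gpu_memory_gb=40, optimizer_buffers=2):
    """Minimum number of GPUs whose combined memory holds the training state."""
    gpu_bytes = gpu_memory_gb * 1024**3
    return math.ceil(training_bytes(num_params, optimizer_buffers) / gpu_bytes)


# Hypothetical model sizes, chosen only for illustration.
for params in (1_000_000_000, 5_000_000_000):
    gib = training_bytes(params) / 1024**3
    print(f"{params:,} params: {gib:.1f} GiB, at least {gpus_needed(params)} GPU(s)")
```

This only counts memory for a single copy of the model state. It says nothing about throughput, which is presumably where running thousands of GPUs in parallel comes in.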