|
This is really good, and I was excited by it, but then I read:

> running on a single 8XA100 40GB node in 38 hours of training

This is a $40-80k machine. Not a diss, but I would love to see an advance that would let anyone with a high-end computer improve on this model. Until that happens, this whole field is going to be owned by big corporations. |
If training a large model now costs the same as driving to visit grandma, that seems like a pretty good deal.