by noosphr
442 days ago
People not in the field have no idea just how distorted the market is right now. I was working at a startup doing end-to-end training of modified BERT architectures, and every part of it was designed to burn money as fast as possible. Buying a GPU: basically impossible right now, we ended up looking at sourcing franken-cards _from_ China. Power and heat removal: you need a large factory's worth of power in the space of a small flat. Pre-training something that's not been pre-trained before: say hello to throwing out more than 80% of pretraining runs because of a novel architecture. Without hugely deep pockets, a contract with NVIDIA, and a datacenter right next to a nuclear power plant, you can't compete at the model level.