Hacker News
by ritz_labringue 1640 days ago
Doesn't it cost hundreds of thousands of dollars just to train GPT-3? If so, that seems like a good reason to use a "managed" GPT-3.

Yes, but they didn't release the model after training, and you can't take your weights with you if you fine-tune their model.

GPT Neo was trained at similar expense, and they released the weights. Use that.

The first part is correct; the second is not. GPT Neo is a 2.7B-parameter model, while the largest GPT-3 is 175B (it comes in various sizes, up to 175B). I appreciate the sentiment and what EleutherAI is doing with GPT Neo, but there is no open-source equivalent of the full GPT-3 available for the public to use. Hopefully it's just a matter of time.

GPT-J is 6B and comes pretty close. Also, in practice I haven’t noticed a difference.

Keep in mind there are also closed-source alternatives: for example, AI21’s Jurassic-1 models are comparable, cheaper, and technically larger (albeit somewhat comically, at 178B instead of 175B parameters).

Thanks! Didn't know that. Isn't it also very expensive to run?
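
One rough way to get a feel for the cost of running these models is to estimate how much memory the weights alone take up. The sketch below is a back-of-the-envelope calculation, not a benchmark: it assumes 16-bit weights (2 bytes per parameter), ignores activations and framework overhead, and uses the parameter counts quoted in this thread. The function name `weights_gb` is made up for illustration.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: 16-bit weights (2 bytes each). Activations, caches and
# framework overhead are ignored, so real requirements are higher.

def weights_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the weights in GB (10^9 bytes).

    Billions of parameters times bytes per parameter gives billions of
    bytes directly, i.e. GB.
    """
    return params_billions * bytes_per_param

# Parameter counts (in billions) as quoted in this thread.
MODELS = {
    "GPT-Neo": 2.7,
    "GPT-J": 6,
    "GPT-3": 175,
    "Jurassic-1": 178,
}

for name, size in MODELS.items():
    print(f"{name}: ~{weights_gb(size):.1f} GB of weights")
```

Under those assumptions, a 175B model needs about 350 GB just to hold its weights, while GPT-J's 6B needs about 12 GB, which is why the smaller open models are so much easier to host yourself.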