by dandiep
1236 days ago
A couple of points which I don't see elsewhere: 1) They have the best quality model. Better quality means more users. More users means more data. Which means higher quality... 2) Operationalizing & scaling these models is non-trivial. I'm not sure what the state of distillation/pruning is for GPT-3, but I imagine they have figured out some proprietary techniques. 3) It's not just publishing a single model, but making it so people can fine-tune and push their own. Because they've gotten good at 2, now anyone can create their own version of GPT customized for their use case. Will Google or others be able to do the same eventually? Definitely. My point is more that it's not just training the model and running it.
Specifically, training data does not primarily come from interactions with the model. With RLHF this data might become more important, but it is still a very small portion.