by GaggiX
694 days ago
GPT-3 had 175B parameters, and it's very bad compared to the much smaller models we have nowadays. The data and the compute play a giant role. Also, I doubt you would need to train a model further after you have trained it on absolutely everything (but we are very far from that).