by rahimnathwani
2245 days ago
These are very big models, roughly 100x to 300x the number of parameters of ResNet-50. 2.7bn parameters (for the smaller model) means on the order of 2.7bn multiply-add operations for a single step of the model. You could fit the model in main memory, but how long would it take to run all those calculations on a CPU? And the full model needs to run multiple times to output a single sentence.
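
To put rough numbers on that, here is a back-of-envelope sketch. Only the 2.7bn parameter count comes from the comment above. The fp32 weights, the two floating-point operations per parameter per step, the sustained CPU throughput, and the sentence length are all assumptions picked for illustration, so treat the result as an order-of-magnitude guess rather than a benchmark.

```python
# Back-of-envelope estimate. Every constant except PARAMS is an assumption.
PARAMS = 2.7e9             # smaller model's parameter count (from the comment)
FLOPS_PER_PARAM = 2        # assumed: one multiply and one add per parameter per step
BYTES_PER_PARAM = 4        # assumed: weights stored as fp32
CPU_FLOPS_PER_SEC = 100e9  # assumed sustained CPU throughput (hypothetical)
STEPS_PER_SENTENCE = 20    # assumed number of model steps per sentence (hypothetical)


def weights_gb(params=PARAMS, bytes_per_param=BYTES_PER_PARAM):
    """Memory needed just to hold the weights, in gigabytes."""
    return params * bytes_per_param / 1e9


def seconds_per_step(params=PARAMS, cpu_flops_per_sec=CPU_FLOPS_PER_SEC):
    """Compute-only lower bound for one step of the model."""
    return params * FLOPS_PER_PARAM / cpu_flops_per_sec


def seconds_per_sentence(steps=STEPS_PER_SENTENCE, **kwargs):
    """Lower bound for a whole sentence, one model step at a time."""
    return steps * seconds_per_step(**kwargs)


if __name__ == "__main__":
    print(f"weights:      {weights_gb():.1f} GB")
    print(f"per step:     {seconds_per_step():.3f} s")
    print(f"per sentence: {seconds_per_sentence():.2f} s")
```

The estimate counts arithmetic only. A real CPU run also has to stream the weights from memory on every step, and sustained throughput is usually well below the peak figure, so actual latency would likely be worse than what this prints.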