by imtringued 855 days ago
To be fair, a lot of the GPU edge comes from fast memory. A GPU with 20 TFLOPS running a 30-billion-parameter model has a compute budget of roughly 700 FLOPs per parameter per second. Meanwhile, the sheer size of the model prevents you from loading it from memory more than 20 times per second, so most of that compute budget goes unused.
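For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. Only the 20 TFLOPS and 30 billion parameters come from the comment; the 2 bytes per parameter (fp16), the 1.2 TB/s memory bandwidth, and the 2 FLOPs per parameter per pass are my own assumptions, picked so that the numbers land near the "20 loads per second" figure.

```python
def flops_per_param_per_second(peak_flops, n_params):
    """Compute budget available per parameter each second."""
    return peak_flops / n_params


def max_model_loads_per_second(bandwidth_bytes_per_s, n_params, bytes_per_param):
    """How many times per second the full set of weights can be read from memory."""
    return bandwidth_bytes_per_s / (n_params * bytes_per_param)


PEAK_FLOPS = 20e12        # 20 TFLOPS, from the comment
N_PARAMS = 30e9           # 30 billion parameters, from the comment
BYTES_PER_PARAM = 2       # assumption: fp16 weights
BANDWIDTH = 1.2e12        # assumption: 1.2 TB/s memory bandwidth
FLOPS_PER_PARAM_PASS = 2  # assumption: one multiply-add per parameter per pass

budget = flops_per_param_per_second(PEAK_FLOPS, N_PARAMS)
loads = max_model_loads_per_second(BANDWIDTH, N_PARAMS, BYTES_PER_PARAM)

# Fraction of the compute budget actually used when every pass has to
# stream the whole model from memory once.
utilization = (loads * FLOPS_PER_PARAM_PASS) / budget

print(f"compute budget: {budget:.0f} FLOPs per parameter per second")
print(f"model loads:    {loads:.0f} per second")
print(f"utilization:    {utilization:.0%}")
```

Under those assumptions the budget comes out at about 667 FLOPs per parameter per second, while 20 passes only consume about 40 of them, so the GPU would sit at around 6% of peak compute. Different precision or bandwidth shifts the numbers, but not the overall picture.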