by sanketsarang
1715 days ago
On the same basis, it would also help if you could provide a comparison between the GPUs commonly used for ML: Tesla K80, P100, T4, V100 and A100. How has the architecture evolved to make the A100 significantly faster? Is it just the 80GB of RAM, or is there more to it from an architecture standpoint?
Oh, very much so. By way more than an order of magnitude. For a deeper read, have a look at the "architecture white papers" for Kepler, Pascal, Volta/Turing, and Ampere:
https://duckduckgo.com/?t=ffab&q=NVIDIA+architecture+white+p...
or check out the archive of NVIDIA's parallel4all blog ... hmm, that's weird, it seems like they've retired it. They used to have really good blog posts explaining what's new in each architecture.
You could also have a look here:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index....
for the table of various numeric sizes and limits which change with different architectures. But that's not a very useful resource in and of itself.
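To find the right column in that table, you need each card's compute capability. Here is a rough sketch that maps the GPUs from the question to their architecture generation and compute capability, as far as I know them; the `GPUS` dict and the `by_generation` helper are names I made up for illustration, and none of this touches a real GPU or any NVIDIA API:

```python
# Hypothetical lookup table (not an NVIDIA API): GPU name -> (architecture,
# compute capability). The compute capability is the version number that the
# per-architecture tables in the CUDA programming guide are keyed on.
GPUS = {
    "Tesla K80": ("Kepler", (3, 7)),
    "Tesla P100": ("Pascal", (6, 0)),
    "Tesla V100": ("Volta", (7, 0)),
    "Tesla T4": ("Turing", (7, 5)),
    "A100": ("Ampere", (8, 0)),
}


def by_generation(gpus=GPUS):
    """Return (name, architecture, 'major.minor') tuples, oldest first."""
    ordered = sorted(gpus.items(), key=lambda item: item[1][1])
    return [(name, arch, f"{cc[0]}.{cc[1]}") for name, (arch, cc) in ordered]


if __name__ == "__main__":
    for name, arch, cc in by_generation():
        print(f"{name:<11} {arch:<7} compute capability {cc}")
```

Note that the T4 (Turing) sorts after the V100 (Volta) by compute capability even though it is the smaller, inference-oriented card, which is why the white papers are a better guide to performance than the version number alone.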