|
|
|
|
|
by variadix
572 days ago
|
|
How much of this is because CUDA is designed for GPU execution and because the GPU ISA isn’t a stable interface? E.g. new GPU instructions can be utilized by new CUDA compilers for new hardware because the code wasn’t written to a specific ISA? Also, don’t people fine tune GPU kernels per architecture manually (either by hand or via automated optimizers that test combinations in the configuration space)? |
|
And the PTX to SASS compiler DOES a degree of automatic fine tuning between architectures. Nothing amazing or anything, but it's a minor speed boost that has made PTX just a easier 'assembly-like language' to build on top of.