|
|
|
|
|
by vilya
3692 days ago
|
|
In CUDA you can write your device kernel as a templated function and the compiler will specialise it based on how it's called by the host. In some cases this can result in big speed-ups while keeping the code simple and maintainable. You could get the same speed from specialising the code by hand, or with a separate code generation tool; but having that ability built in to the compiler makes it very easy to use. It's also handy for the compiler to be able to check for errors in shader invocations. Looking up parameters by name and setting them dynamically adds a whole class of potential runtime errors that don't exist in straight c++ programs. |
|
As to runtime errors because of misspelling a parameter name, this is going to be fixed the first time you run the program and you live happily ever after; I don't mind these errors in Python very much and I don't mind it in C programs using strings to refer to variable names in whatever context that may happen.
Overall the amount of "dynamism" in CPU/GPU programming IMO does not result in wasting almost any optimization opportunities nor does it make the program significantly harder to get right, but I'm prepared to change my mind given counterexamples (certainly in Python it's really easy to demonstrate missed optimization opportunities relatively to C due to its dynamism... The amount of dynamism in CPU/GPU programming however is IMO trivial and hence the issues resulting from it are also rather trivial.)