by kevmo314
19 days ago

> This is not the first time we can see Nvidia taking shortcuts to achieve maximum performance of their GPUs

Why is implementing it correctly not performant? For context, I have no idea how rounding is typically implemented anyway.

NVIDIA is not solely responsible, because the Microsoft DirectX specification itself includes the non-standard behavior.
Nevertheless, as shown in TFA, both the AMD and Intel GPUs allow the user to choose between correct behavior and incorrect behavior that might be faster, while NVIDIA ignores what the user requests and implements only the non-standard behavior.
Developers of graphics or ML/AI applications mostly do not care about such errors, but there are also people who want to use GPUs for ordinary computations where the accuracy of the results matters, and they want to be able to choose between correct behavior and incorrect but faster behavior.
Actually "faster" is a misnomer, because denormals can be handled correctly without diminishing the speed, but that costs additional die area. Thus what NVIDIA gains by not implementing the right behavior is a reduced production cost.
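
To make "correct" versus "incorrect" concrete, here is a minimal CPU-side sketch in Python of what flushing denormals to zero does to a result. It is only an emulation on 64-bit floats, not what any GPU actually runs, and `flush_to_zero` is a hypothetical helper written for this illustration.

```python
import math
import sys

# Smallest positive *normal* double (2**-1022); anything smaller in
# magnitude, other than zero, is a denormal (subnormal) number.
SMALLEST_NORMAL = sys.float_info.min


def flush_to_zero(x: float) -> float:
    """Hypothetical helper: emulate flush-to-zero on a single result.

    Any nonzero value too small to be a normal number is replaced by a
    zero of the same sign; everything else passes through unchanged.
    """
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x


x = 1.5 * SMALLEST_NORMAL
y = SMALLEST_NORMAL

# IEEE 754 gradual underflow: the difference is a denormal, so it
# stays nonzero whenever x != y.
gradual = x - y

# Emulated flush-to-zero: the same difference collapses to zero,
# even though x and y are different numbers.
flushed = flush_to_zero(x - y)

print("x != y:            ", x != y)
print("gradual underflow: ", gradual)
print("flush-to-zero:     ", flushed)
```

With gradual underflow, `x != y` guarantees `x - y != 0`, so a check like `if x != y: z = 1 / (x - y)` never divides by zero. With flush-to-zero that guarantee is lost, which is the kind of accuracy problem that matters outside graphics.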