by jhgb
1880 days ago
If I recall correctly, on a CPU the penalty for "double-X" arithmetic (emulating a wider type with a pair of X values, e.g. double-single or double-double) is something like 1:7 compared to plain X. On most consumer GPUs these days, native doubles run at 1:24 or 1:32 the rate of singles, don't they? So double-single should still be around a fourfold speedup over native doubles (24/7 ≈ 3.4, 32/7 ≈ 4.6). Mixed binary operations with one single and one double-single operand should be cheaper still, whenever applicable. As for "rolling your own": this is effectively a compiler transformation, so whether it's painful or not may depend on your workflow.
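To make the double-single idea concrete, here is a minimal sketch of the additions involved. It is only an illustration, not anyone's production code: the helper names (`f32`, `two_sum`, `ds_add_single`, `ds_add`) are made up for this example, single precision is emulated by rounding Python floats through `struct`, and the full add is the cheap "sloppy" variant rather than a fully accurate one.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Error-free addition of two singles.

    Returns (s, e) where s is the rounded single-precision sum and
    e is the rounding error, so that s + e equals a + b exactly.
    """
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def renormalize(s, e):
    """Fold a small correction e back into (hi, lo); assumes |s| >= |e|."""
    hi = f32(s + e)
    lo = f32(e - f32(hi - s))
    return hi, lo

def ds_add_single(hi, lo, x):
    """Mixed operation: double-single (hi, lo) plus a plain single x."""
    s, e = two_sum(hi, x)
    e = f32(e + lo)
    return renormalize(s, e)

def ds_add(ahi, alo, bhi, blo):
    """Double-single plus double-single (simple, slightly sloppy variant)."""
    s, e = two_sum(ahi, bhi)
    e = f32(e + f32(alo + blo))
    return renormalize(s, e)

tiny = 2.0 ** -30
print(f32(1.0 + tiny))               # plain single: the small term is lost
print(ds_add_single(1.0, 0.0, tiny)) # double-single keeps it in the low word
```

In this sketch the mixed version skips the addition of the two low words, which is the sense in which mixed operations come out cheaper; with a more careful full add the gap would be larger. Every operation is a fixed sequence of single-precision adds and subtracts, which is why it can be treated as a mechanical rewrite of the original expression.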