by DTolm 2042 days ago
FFT is an extremely bandwidth-limited problem, so if both algorithms spend most of their time on a single upload, their overall times will be similar. A more in-depth analysis of how VkFFT and cuFFT scale with memory clocks and bandwidth can be found here: https://www.reddit.com/r/nvidia/comments/jxlbjs/rtx_3090_ove...
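
To make the bandwidth argument concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from VkFFT or cuFFT: the function name, its parameters, and the numbers in the usage lines are all illustrative assumptions. It assumes every element is read once and written once per pass, and it ignores compute time, caches, and launch overhead.

```python
def bandwidth_bound_time_s(num_elements, bytes_per_element=8,
                           passes=1, bandwidth_gb_s=1000.0):
    """Lower bound on runtime if memory traffic is the only cost.

    Hypothetical helper for illustration only.
    bytes_per_element=8 assumes single-precision complex values.
    """
    # Each pass reads every element once and writes it once.
    bytes_moved = 2 * num_elements * bytes_per_element * passes
    # Convert GB/s to bytes/s before dividing.
    return bytes_moved / (bandwidth_gb_s * 1e9)


# Two implementations that make the same number of passes over the data
# get the same bound, no matter how their kernels differ internally.
n = 2 ** 20
t_one_pass = bandwidth_bound_time_s(n, passes=1)
t_two_pass = bandwidth_bound_time_s(n, passes=2)
print(t_one_pass, t_two_pass)
```

Under this model the estimate depends only on the bytes moved and the memory bandwidth, which is why two implementations with the same memory traffic would be expected to finish in about the same time.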

I don't know exactly what cuFFT does differently, but I am fairly certain it uses a very similar memory layout and very similar algorithms (judging by execution times only).

The main takeaway should be that Vulkan allows low-level memory control with similar performance, while being cross-platform and open source. I don't think SPIR-V is more expressive; I bet Nvidia wouldn't allow that. But that doesn't prevent it from still being good.