by Baldbvrhunter 896 days ago
It would also mean learning Julia, but you can write GPU kernels in Julia and then compile for NVIDIA CUDA, AMD ROCm or Intel oneAPI.

https://juliagpu.org/

I've written CUDA kernels and I knew nothing about it going in.

1 comment

While I am a fan of Julia and its GPU module, using such an easy environment will really limit what you are able to learn. NVIDIA provides some great profiling tools (Nsight Systems and Nsight Compute) that help you optimize your kernel execution (fuse kernels, hide latency, use execution graphs, i.e. CUDA Graphs) and your CUDA code (take advantage of memory layout, use warp intrinsics, maximize throughput). These tools map back to C++/CUDA source code and let you rapidly address bottlenecks, many of which may turn out to be on the host side.
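To make "warp intrinsics" and "memory layout" a bit more concrete, here is a minimal sketch of a sum reduction in CUDA C++. The names `warp_reduce_sum`, `sum_kernel` and `device_sum` are made up for illustration, the block size is assumed to be a multiple of 32, and error checking is left out to keep it short. Treat it as one possible way to write this, not a tuned implementation.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: sums one value per lane across a full warp using
// register shuffles. After the loop, lane 0 holds the warp's total.
__inline__ __device__ float warp_reduce_sum(float v) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2) {
        v += __shfl_down_sync(0xffffffffu, v, offset);
    }
    return v;
}

// Hypothetical kernel: adjacent threads read adjacent floats (coalesced
// loads) in a grid-stride loop, then each warp issues a single atomicAdd.
// Assumes blockDim.x is a multiple of 32 so every warp is full.
__global__ void sum_kernel(const float* in, float* out, int n) {
    float acc = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        acc += in[i];
    }
    acc = warp_reduce_sum(acc);
    if ((threadIdx.x & (warpSize - 1)) == 0) {
        atomicAdd(out, acc);
    }
}

// Hypothetical host wrapper: copies the input to the device, launches the
// kernel, and copies the single-float result back. Assumes h is non-empty.
float device_sum(const std::vector<float>& h) {
    const int n = static_cast<int>(h.size());
    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(float));

    const int threads = 256;  // multiple of 32, as the kernel assumes
    const int blocks = (n + threads - 1) / threads;
    sum_kernel<<<blocks, threads>>>(d_in, d_out, n);

    float result = 0.0f;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}

int main() {
    // All ones, so every partial sum is a small integer and exact in float.
    std::vector<float> h(1 << 20, 1.0f);
    const float s = device_sum(h);
    std::printf("sum = %.1f (expected %.1f)\n", s, static_cast<float>(h.size()));
    return s == static_cast<float>(h.size()) ? 0 : 1;
}
```

If you build something like this with `nvcc -lineinfo` and run it under `nsys profile` or `ncu`, the profilers can attribute time to individual source lines and to the host-side copies, which is the kind of feedback the comment above is talking about.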