by marcyb5st 1277 days ago
Last time I seriously checked (6 months ago or so), ROCm was still a far cry from CUDA. Setup was a mess, support was hit-and-miss, and some operations were not particularly performant compared to their CUDA counterparts. Additionally, there are TensorFlow and probably PyTorch forks that should work with it, but they lag behind the official repositories quite a bit.

I hope that now that generative AI is becoming mainstream, AMD steps up its game on both its consumer and professional lineups. If I were to buy a video card right now (mostly for gaming, hobby ML projects, and running Stable Diffusion), I wouldn't pick AMD, because I could do just 1/3 of my use cases (gaming) properly without headaches.

1 comments

OpenCL works pretty well. I can't say I notice large performance gaps between CUDA and OpenCL for my HPC work.
Thankfully, for a good chunk of number crunching that works fine. But the other side of the coin is AI workloads. There's no OpenCL or Vulkan standard for exposing matrix units, only vendor-specific ones.

Notably, for OpenCL: cl_qcom_ml_ops (Qualcomm); for Vulkan: VK_NV_cooperative_matrix (NVIDIA).
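
To make the vendor-specific point concrete, here is a rough sketch of how you might check whether a device advertises one of those extensions. It assumes you already have the extension list from the driver (OpenCL's CL_DEVICE_EXTENSIONS query returns a space-separated string; Vulkan gives you a list of names). The helper `find_matrix_extensions` and the sample extension strings are made up for illustration, not real device dumps.

```python
# Vendor-specific matrix extensions named in the comment above.
MATRIX_EXTENSIONS = {
    "opencl": ["cl_qcom_ml_ops"],
    "vulkan": ["VK_NV_cooperative_matrix"],
}


def find_matrix_extensions(api, advertised):
    """Return the known matrix extensions present in an advertised list.

    `advertised` is either a space-separated string (the form OpenCL's
    CL_DEVICE_EXTENSIONS query returns) or an iterable of extension names.
    This helper is hypothetical, not part of any OpenCL or Vulkan binding.
    """
    if isinstance(advertised, str):
        advertised = advertised.split()
    present = set(advertised)
    return [name for name in MATRIX_EXTENSIONS[api] if name in present]


# Made-up extension lists for illustration only.
sample_opencl = "cl_khr_fp16 cl_khr_global_int32_base_atomics cl_qcom_ml_ops"
sample_vulkan = ["VK_KHR_swapchain", "VK_KHR_maintenance1"]

print(find_matrix_extensions("opencl", sample_opencl))
print(find_matrix_extensions("vulkan", sample_vulkan))
```

The upshot is that portable code has to branch per vendor and fall back to plain compute kernels when nothing matches, which is exactly the headache.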

Have you done any benchmarks with Vulkan?
No, I haven't used Vulkan for compute.