by jeff_science 4710 days ago
CPU programming, on the other hand, especially with OpenMP, is hard to get into at first, but once you have the right formula you can apply it pretty easily to most common tasks.
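For readers who haven't seen it, the "formula" in question is probably something like the loop pattern below. This is only a sketch of one common OpenMP idiom (a parallel loop, with a reduction when there is an accumulator); the function names `dot` and `axpy` are made up for illustration and are not from the comment above.

```c
#include <stddef.h>

/* Dot product of two length-n arrays.
 * The reduction clause gives each thread a private partial sum
 * and adds the partial sums together when the loop finishes. */
double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    /* A signed loop index keeps this valid for older OpenMP versions. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* y = alpha * x + y. Iterations are independent, so a plain
 * parallel for is enough and no reduction is needed. */
void axpy(double alpha, const double *x, double *y, size_t n)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        y[i] += alpha * x[i];
    }
}
```

With GCC or Clang this is built with `-fopenmp`. Without that flag the pragmas are ignored and the same code runs serially, which is a large part of why the pattern is easy to reuse.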

My guess is that you've spent a lot more time trying to understand CUDA and NVIDIA hardware. Or maybe all of your arrays have dimensions that are multiples of 16.
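To spell out the multiples-of-16 jab: if work is handed out in fixed 16-wide tiles (the way a GPU launch hands out fixed-size blocks), any size that isn't a multiple of 16 needs a rounded-up tile count and a bounds check in the last tile. Here is a rough CPU-side sketch in plain C, assuming a tile width of 16; `TILE`, `tiles_needed` and `scale_tiled` are hypothetical names, and this is not actual CUDA code.

```c
#include <stddef.h>

enum { TILE = 16 };  /* assumed tile width, standing in for a GPU block size */

/* Number of 16-wide tiles needed to cover n elements (ceiling division). */
size_t tiles_needed(size_t n)
{
    return (n + TILE - 1) / TILE;
}

/* Scale every element of x by alpha, walking the array tile by tile. */
void scale_tiled(float *x, size_t n, float alpha)
{
    size_t tiles = tiles_needed(n);
    for (size_t t = 0; t < tiles; t++) {
        for (size_t lane = 0; lane < TILE; lane++) {
            size_t i = t * TILE + lane;
            /* This guard only matters in the last tile, and only
             * when n is not a multiple of TILE. */
            if (i < n) {
                x[i] *= alpha;
            }
        }
    }
}
```

When n is a multiple of 16 the guard never fires and can be dropped, which is the easy case the comment is poking at.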