by pjmlp 66 days ago
Yet there are gains from doing e.g. string searches with SIMD, which you naturally aren't going to do in CUDA.

For sure, it makes sense for nice, well-defined problems that execute in isolation.

Think of the situation where the string search is running on a system with hyper-threading, a bunch of cores, and a normal amount of memory bandwidth.

The search itself will be faster, but if you overuse vector instructions you make everything else running on the machine worse.

(Also, cherry on top: some modern CPUs automagically lower their clock speed when they encounter wide vector instructions!)
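
For anyone wondering what "string search with SIMD" looks like in practice, here is a rough sketch of the simplest case: finding the first occurrence of a single byte, 16 bytes at a time, using SSE2. The function name `find_byte` is made up for illustration, `__builtin_ctz` assumes GCC or Clang, and the code falls back to a plain scalar loop when SSE2 isn't available. Real implementations (memchr in a libc, for example) are considerably more careful about alignment and wider registers, so treat this as a sketch and not how any particular library does it.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper: returns the index of the first `needle` byte in
 * buf[0..len), or -1 if it does not occur. */
static long find_byte(const char *buf, size_t len, char needle) {
    size_t i = 0;
    assert(buf != NULL);
#if defined(__SSE2__)
    /* Broadcast the needle into all 16 byte lanes. */
    const __m128i pattern = _mm_set1_epi8(needle);
    for (; i + 16 <= len; i += 16) {
        /* Unaligned load of 16 bytes, then compare every lane at once. */
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        /* One bit per lane: bit k is set if buf[i + k] == needle. */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, pattern));
        if (mask != 0) {
            /* Lowest set bit is the first match (GCC/Clang builtin). */
            return (long)i + __builtin_ctz((unsigned)mask);
        }
    }
#endif
    /* Scalar tail, and the whole search when SSE2 is unavailable. */
    for (; i < len; i++) {
        if (buf[i] == needle) {
            return (long)i;
        }
    }
    return -1;
}
```

Calling `find_byte(s, n, 'q')` on a buffer `s` of length `n` gives the offset of the first `'q'`. In isolation the vector loop does far fewer compare-and-branch steps than the scalar one; the point above is that on a shared machine it still competes with the sibling hyper-thread and everything else for execution resources and memory bandwidth.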