by lahwran
3053 days ago
Are you imagining this running inside a CUDA kernel? If yes, the problem with doing anything of the kind is that, even if you run your kernel with one warp and mask all but one thread, you still need round trips to the CPU to do I/O, and you need to dynamically load code, both of which the GPU is quite bad at. If no, it is probably doable, if a bit hard, to generate kernels on the fly.
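To make the "generate kernels on the fly" option a bit more concrete, here is a rough sketch of the host-side half: building CUDA C source text at runtime from a template. The names `KERNEL_TEMPLATE`, `make_kernel_source`, and `scale_add` are made up for illustration, and this only produces the source string. Actually compiling and launching it (for example through a runtime compiler such as NVRTC) is left out and is where most of the difficulty would be.

```python
from string import Template

# Hypothetical template for an elementwise kernel; the body expression is
# filled in at runtime. This is plain text generation, nothing GPU-specific.
KERNEL_TEMPLATE = Template("""\
extern "C" __global__ void $name(const float* x, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = $expr;
    }
}
""")


def make_kernel_source(name: str, expr: str) -> str:
    """Return CUDA C source for an elementwise kernel computing `expr`.

    `expr` is trusted input here (an assumption of this sketch); it may
    refer to x[i] and is pasted into the kernel body verbatim.
    """
    return KERNEL_TEMPLATE.substitute(name=name, expr=expr)


# Generate source for out[i] = 2*x[i] + 1. A real system would hand this
# string to a runtime compiler and then launch the resulting kernel.
source = make_kernel_source("scale_add", "2.0f * x[i] + 1.0f")
print(source)
```

The point of the sketch is that all the dynamic work (deciding what code to run, I/O) stays on the CPU, and the GPU only ever sees a finished kernel.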