by Athas 2065 days ago
I'm not sure I understand the bounds checking example. Can someone shed some light on how CUDA.jl does bounds checking? It's not entirely straightforward to do on a GPU.

Since we use fat array objects, and not raw pointers, we know the size of the array and can perform bounds checks at run time. We then have a mechanism to throw an exception and signal it to the CPU to display it there. That's obviously quite expensive, so you can disable it with that annotation (the Julia debug setting also controls the granularity, and thus how expensive the exception handling is). It's fairly primitive, i.e. no full-featured exception handling (for now), but has proven very useful already.
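To make the idea above concrete, here is a minimal plain-Python simulation of a fat array that carries its own length and records a bounds violation in a slot the host can inspect later. It is only a sketch of the general technique, not CUDA.jl's actual implementation; `ErrorSlot`, `FatArray`, and `load` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ErrorSlot:
    """Hypothetical stand-in for a host-visible location that device code can write to."""
    message: Optional[str] = None


@dataclass
class FatArray:
    """Hypothetical fat array: the data travels together with its length,
    unlike a raw pointer, so every access can be checked."""
    data: List[float]
    errors: ErrorSlot

    def load(self, i: int) -> float:
        # The length is known, so the check is a simple comparison at run time.
        if not (0 <= i < len(self.data)):
            # Record only the first violation; the host reports it afterwards.
            if self.errors.message is None:
                self.errors.message = (
                    f"BoundsError: index {i} outside array of length {len(self.data)}"
                )
            # Return a dummy value instead of touching memory out of range.
            return 0.0
        return self.data[i]
```

After the simulated kernel finishes, the host would look at `errors.message` and display it if it is set. The extra comparison and bookkeeping on every access is the cost the comment above refers to.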
How do you terminate the CUDA kernel when a bounds violation is encountered by a single thread? I don't think the CUDA API exposes a mechanism to do that safely.
You can emit `trap` or `exit` in the PTX code (although that has exposed many bugs in the PTX assembler, because it does not expect that kind of control flow, which is often divergent). But even if you just had the kernel return and otherwise produce invalid results, being able to report a bounds error, instead of silently corrupting data and/or triggering a fatal `CUDA_ERROR_ILLEGAL_ADDRESS` (which requires an application restart), is a significant usability improvement.
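The "just return and report afterwards" fallback can be sketched in plain Python as well. This is a sequential simulation under stated assumptions, not real GPU code: `kernel`, `launch`, and the one-element `error_flag` list are hypothetical names for this example, and the threads run one after another in a loop rather than in parallel.

```python
from typing import List, Optional


def kernel(tid: int, src: List[int], dst: List[int],
           error_flag: List[Optional[int]]) -> None:
    """One simulated thread: writes 2 * src[tid] into dst[tid]."""
    if not (0 <= tid < len(src)):
        # Bounds violation: remember the first offending thread and
        # return early instead of reading or writing out of range.
        if error_flag[0] is None:
            error_flag[0] = tid
        return
    dst[tid] = 2 * src[tid]


def launch(num_threads: int, src: List[int]) -> List[int]:
    """Simulated host side: run every thread, then check the flag."""
    dst = [0] * len(src)
    error_flag: List[Optional[int]] = [None]
    for tid in range(num_threads):
        kernel(tid, src, dst, error_flag)
    # Only after the whole "kernel" has finished does the host turn the
    # recorded violation into an ordinary exception.
    if error_flag[0] is not None:
        raise IndexError(
            f"thread {error_flag[0]} indexed outside an array of length {len(src)}"
        )
    return dst
```

With three threads, `launch(3, [1, 2, 3])` returns `[2, 4, 6]`. With five threads it raises `IndexError` after the in-range threads have done their work, so the caller gets an error it can handle rather than corrupted memory.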