by LightMachine
762 days ago
It is an interpreter that runs on GPUs, and a compiler to native C and CUDA. We don't target SPIR-V directly, but aim to. Sadly, while the C compiler results in the expected speedups (3x-4x, and much more soon), the CUDA runtime didn't achieve substantial speedups compared to the non-compiled version. I believe this is due to warp divergence: with non-compiled procedures, we can actually merge all function calls into a single "generic" interpreted function expander that can be reduced by warp threads without divergence. We'll be researching this more extensively going forward.
|
Edit: nvm, I read through the rest of the codebase. I see that HVM compiles the inet to a large static term and then links against the runtime.
https://github.com/HigherOrderCO/HVM/blob/5de3e7ed8f1fcee6f2...
Will have to play around with this and look at the generated assembly, see how much of the runtime a modern c/cu compiler can inline.
Btw, nice code: very compact and clean, well-organized, and easy to read. Rooting for you!