by cloudhan 897 days ago
I have long sought a CUDA or HIP compiler that targets SPIR-V or DXIL, so that we can compile all those neural network kernels for almost all compute devices.

The requirements are: 1. It extends C++17, which means template metaprogramming works, so that cub or cutlass/cute work.

2. AOT

3. and no shitshow!
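
To illustrate requirement 1, here is a minimal sketch of the kind of C++17 template metaprogramming that libraries such as cub lean on: tile sizes passed as template parameters and a reduction unrolled at compile time. The names `TilePolicy`, `tree_reduce`, and `reduce_tile` are hypothetical and are not cub's API; this is plain host C++ rather than device code.

```cpp
#include <array>
#include <cstddef>
#include <functional>

// Hypothetical policy type: tile dimensions are fixed at compile time.
template <std::size_t BlockThreads, std::size_t ItemsPerThread>
struct TilePolicy {
    static constexpr std::size_t tile_items = BlockThreads * ItemsPerThread;
};

// Recursive tree reduction over a[Begin, End); `if constexpr` (C++17)
// selects the base case so the recursion is resolved by the compiler.
template <std::size_t Begin, std::size_t End, typename T, std::size_t N, typename Op>
constexpr T tree_reduce(const std::array<T, N>& a, Op op) {
    static_assert(Begin < End && End <= N, "range must be non-empty and in bounds");
    if constexpr (End - Begin == 1) {
        return a[Begin];
    } else {
        constexpr std::size_t Mid = Begin + (End - Begin) / 2;
        return op(tree_reduce<Begin, Mid>(a, op), tree_reduce<Mid, End>(a, op));
    }
}

// Reduce one tile whose size must match the policy.
template <typename Policy, typename T, std::size_t N, typename Op>
constexpr T reduce_tile(const std::array<T, N>& tile, Op op) {
    static_assert(N == Policy::tile_items, "tile size must match the policy");
    return tree_reduce<0, N>(tile, op);
}

// Usage: a 4-thread, 2-items-per-thread tile, reduced at compile time.
constexpr std::array<int, 8> kTile{1, 2, 3, 4, 5, 6, 7, 8};
static_assert(reduce_tile<TilePolicy<4, 2>>(kTile, std::plus<>{}) == 36);
```

A compiler that only accepts a C-like subset cannot express this, which is why the requirement rules out plain OpenCL C.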

The only thing that comes close is circle[1].

- OpenCL is a no-go, as it is purely C. It is also a shitshow, especially on Android devices: the vendor drivers are the main source of shit, and JIT compilation adds the rest.

- Vulkan+GLSL is a no-go. The degree of shitness is on par with OpenCL, due to the drivers and JIT compilers.

- slang[2] has the potential, but its metaprogramming is not as strong as C++'s, and existing libraries cannot be used.

The above conclusion is drawn from my work on the OpenCL EP for onnxruntime[3]. It is purely a nightmare to work with those drivers and JIT compilers. Hopefully Vcc can take compute shaders more seriously.

[1]: https://www.circle-lang.org/

[2]: https://shader-slang.com/

[3]: https://github.com/microsoft/onnxruntime/tree/dev/opencl

6 comments

What about HLSL, especially since it is a kind of C++ flavour, particularly after the HLSL 2021 improvements?

https://devblogs.microsoft.com/directx/opening-hlsl-planning...

https://devblogs.microsoft.com/directx/announcing-hlsl-2021/

At the Vulkanised 2023 discussion round, Khronos admitted that they aren't going to improve GLSL any further and, ironically, will rely on Microsoft's HLSL work as the main shader language to go alongside Vulkan.

Maybe something else was discussed at Vulkanised 2024, but I doubt it.

There was some SYCL work to target Vulkan, but it seems to have been a paper-only attempt that fizzled out.

https://dl.acm.org/doi/fullHtml/10.1145/3456669.3456683

At the time I developed the EP, the tooling was not as good as it is now. I had imagined a pipeline in which HLSL compiles down to DXIL, then goes through SPIRV-Cross, and then targets a wide variety of mobile devices with the OpenCL runtime. But those tools are more focused on graphics and cannot work with the kernel execution model, not to mention the structured control flow issues, so it was definitely not going to work. OpenGL does not work with the imagined pipeline, because IIRC it cannot consume SPIR-V bytecode. Vulkan was so niche that it was discarded very early. The final result with the OpenCL runtime and the CL language worked, but the driver is a mess [palmface]
> OpenGL does not work with the imagined pipeline, because IIRC it cannot consume SPIRV bytecode.

What gave you that idea?

  $ eglinfo -a gl -p wayland | grep spirv
  GL_ARB_get_program_binary, GL_ARB_get_texture_sub_image, GL_ARB_gl_spirv, 
  GL_ARB_sparse_texture_clamp, GL_ARB_spirv_extensions, 
  GL_ARB_get_program_binary, GL_ARB_get_texture_sub_image, GL_ARB_gl_spirv, 
  GL_ARB_spirv_extensions, GL_ARB_stencil_texturing, GL_ARB_sync,
There it is: <https://registry.khronos.org/OpenGL/extensions/ARB/ARB_gl_sp...>. Not in OpenGL ES, though.
> At the Vulkanised 2023 discussion round, Khronos admitted that they aren't going to improve GLSL any further and, ironically, will rely on Microsoft's HLSL work as the main shader language to go alongside Vulkan.

That sounds intriguing, but I haven't been able to find any references to it (I guess it was discussed in the panel, but the video of it is private). Do you have a reference or more information about it?

Is it related to adding HLSL support to clang?

> Khronos admitted that they [...] ironically rely on Microsoft's HLSL work as the main shader language to go alongside Vulkan.

So Cg ultimately prevailed over GLSL. Can't say that disappoints me.

It's still early days for Vcc; I outline the caveats on the landing page. While I'm confident the control-flow bits and whatnot will work robustly, there's a big open question when it comes to the fate of standard libraries: the likes of libstdc++ were not designed for this use case.

We'll be working hard on it all the way to Vulkanised. If you have some applications you can get up and running by then, feel free to get in touch.

I think the driver ecosystem for Vulkan is rather high-quality, but that's more my (biased!) opinion than something I have hard data on. The Mesa/NIR-based drivers in particular are very nice to work with!

Those "existing libraries" do not necessarily mean libstdc++, but rather parallel primitives, which are essential to performance portability. For example, cub for scan and reduction, and cutlass for dense linear algebra[1].

> I think the driver ecosystem for Vulkan is rather high-quality

Sorry, I meant OpenGL. At the time of the evaluation, the market share of Vulkan on Android devices was too small, so it was ruled out at a very early stage. I'd assume the situation has changed a lot since then.

It is really good to see more projects take a shot at compiling C++ to GPU natively.

[1] cutlass itself is not portable, but the recently added cute is quite portable, as far as I evaluated. It provides a unified abstraction for hierarchical layout decomposition, along with copy and gemm primitives.
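
As a rough illustration of what "hierarchical layout decomposition" means, here is a minimal C++17 sketch that composes an outer layout of tiles with an inner layout within each tile. `Layout2D` and `TiledLayout` are hypothetical names made up for this example; they are not cute's types, and cute's real abstraction is considerably more general.

```cpp
#include <cstddef>

// Hypothetical compile-time 2D layout: maps (row, col) to a linear offset.
template <std::size_t Rows, std::size_t Cols,
          std::size_t RowStride, std::size_t ColStride>
struct Layout2D {
    static constexpr std::size_t rows = Rows;
    static constexpr std::size_t cols = Cols;
    static constexpr std::size_t offset(std::size_t r, std::size_t c) {
        return r * RowStride + c * ColStride;
    }
};

// Hierarchical composition: Outer places the tiles, Inner places
// the elements within one tile.
template <typename Outer, typename Inner>
struct TiledLayout {
    static constexpr std::size_t rows = Outer::rows * Inner::rows;
    static constexpr std::size_t cols = Outer::cols * Inner::cols;
    static constexpr std::size_t offset(std::size_t r, std::size_t c) {
        // Split the global coordinate into (tile coordinate, in-tile coordinate).
        return Outer::offset(r / Inner::rows, c / Inner::cols)
             + Inner::offset(r % Inner::rows, c % Inner::cols);
    }
};

// Example: a 4x4 matrix stored as a 2x2 grid of row-major 2x2 tiles.
using Tile  = Layout2D<2, 2, 2, 1>;  // row-major within a tile (4 elements)
using Grid  = Layout2D<2, 2, 8, 4>;  // row-major over tiles, 4 elements per tile
using Tiled = TiledLayout<Grid, Tile>;

static_assert(Tiled::offset(0, 0) == 0);
static_assert(Tiled::offset(1, 1) == 3);   // last element of the first tile
static_assert(Tiled::offset(0, 2) == 4);   // first element of the second tile
static_assert(Tiled::offset(3, 3) == 15);
```

Because everything is resolved through templates and `constexpr`, the index arithmetic can be folded by the compiler, which is the part that needs real C++ support in the target compiler.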

Will C++17 parallel algorithms be supported?

https://on-demand.gputechconf.com/supercomputing/2019/pdf/sc...

Edit: Never mind, I think I misunderstood the purpose of this project. I thought it was a CUDA competitor, but it seems like it is just a shading language compiler for graphics.
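
For readers unfamiliar with the C++17 parallel algorithms asked about above, here is a minimal sketch using the standard `<numeric>` algorithms. It uses the overloads without an execution policy so that it builds without any extra threading backend; the helper names `dot` and `prefix_sum` are made up for this example. Whether Vcc supports any of this is not something this example claims.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Dot product via std::transform_reduce (C++17): multiplies element pairs
// and sums the products. Assumes y has at least as many elements as x.
// Passing std::execution::par (from <execution>) as the first argument
// would select the parallel overload instead.
inline double dot(const std::vector<double>& x, const std::vector<double>& y) {
    return std::transform_reduce(x.begin(), x.end(), y.begin(), 0.0);
}

// Inclusive prefix sum via std::inclusive_scan (C++17).
inline std::vector<int> prefix_sum(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::inclusive_scan(in.begin(), in.end(), out.begin());
    return out;
}
```

Reduction and scan are the same primitives that cub provides on the CUDA side, which is why they keep coming up in this thread.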

SYCL/DPC++ are the only viable CUDA competitors I would say, assuming that the tooling gets feature parity.
Circle is also well worth checking out.
See https://github.com/google/clspv for an OpenCL implementation on Vulkan Compute. There are plenty of quirks involved because the two standards use different varieties of SPIR-V ("kernels" vs. "shaders") and provide different guarantees (Vulkan Compute doesn't care much about numerical accuracy). The Mesa folks are also looking into this as part of their RustiCL (a modern OpenCL implementation) and Zink (implementing OpenGL and perhaps OpenCL itself on Vulkan) projects.
chipStar (formerly CHIP-SPV) might also be worth checking out: https://github.com/CHIP-SPV/chipStar

It compiles CUDA/HIP C++ to SPIR-V that can run on top of OpenCL or Level Zero. (It does require OpenCL's compute-flavored SPIR-V, instead of the graphics-flavored SPIR-V seen in OpenGL or Vulkan. I also think it requires some OpenCL extensions that are currently exclusive to Intel NEO, but should on paper be coming to Mesa's rusticl implementation too.)

GCC supports nvptx and amd via OpenMP offloading and OpenACC. I have no idea how well it works.
All of these are JIT-compiled by the driver in the end, though?
Not exactly. Both CL and GLSL can be AOT-compiled, but then the runtime is limited to some newer version and the market coverage becomes niche. Those vendors are so lazy about updating the drivers and fixing the compiler bugs...
Having worked for a few of those lazy vendors, I can say the AOT output usually just ends up being bitcode that is fully JIT-compiled later.