| > AMD fucked up by not having a stable IR between GPU generations The lack of a stable IR is probably deliberate. Much like the "we won't support DLLs or pluggable APIs, only statically compiling it into your application" with FSR2, once you port to HIP you're locked in. AMD wants you working in HIP, compiling from HIP, not treating them as an IR - they don't want to be an alternate runtime for NVIDIA's ecosystem. And again, much like FSR2, they are in fact willing to compromise end-user experience (no updates) or developer convenience (continual patching) in order to do it. No libraries, only distribute as source, ever. It's not about library pluggability or runtime compatibility (after all GPU Ocelot already existed), what they want is you building the ROCm Ecosystem and not the CUDA Ecosystem or OneAPI Ecosystem. That's understandable from a corporate strategy perspective, as a corporation you don't want to be building a product on someone else's platform, because that gives a lot of freedom for the platform owner to fuck with you. But like, the whole "we won't even do libraries/IR" is a little crass from a customer experience/developer experience perspective, and it kinda goes against the whole good-guy-AMD mythos they've built up. |
The problem is that shipping machine code makes the binary machine-specific. If I compile a HIP program for my RX 580 (gfx803), I won't be able to run the same binary on someone else's 6800 XT (gfx1030). (I think technically you can bundle code for both targets in a single fat binary, but you have to list every target at build time, so that's still a terrible solution.)
CUDA instead ships PTX (NVPTX is the name of LLVM's backend for it), an IR that the driver compiles to machine code as long as the GPU's compute capability is at least the one the PTX was built for. That's similar to how it works in the graphics world: you submit your GLSL/HLSL source code or SPIR-V bytecode to the driver, which compiles it for the right GPU.
Intel's oneAPI/Level Zero ships SPIR-V, afaik. oneAPI can also run on top of OpenCL instead of L0. SPIR-V is neat because it's an open standard, so in theory you could get L0 working on non-Intel GPUs (and iirc Intel also uses it for e.g. FPGAs). Either way, both SPIR-V and PTX solve the "machine-specific" problem AMD has.
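To make the difference concrete, here's a toy model of the loading rule (not real driver or runtime code; the `Binary` class and `can_run` function are made up for illustration, and the "capability at least as new as the IR's target" check is a simplification of how PTX JIT compatibility works):

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Binary:
    # Hypothetical model of a shipped GPU binary.
    # ISA names with precompiled machine code embedded, e.g. {"gfx803"}.
    machine_code: Set[str] = field(default_factory=set)
    # Minimum compute capability the embedded IR targets, or None if no IR is shipped.
    ir_min_capability: Optional[Tuple[int, int]] = None


def can_run(binary: Binary, gpu_isa: str,
            gpu_capability: Optional[Tuple[int, int]] = None) -> bool:
    # Exact machine-code match: works with or without an IR.
    if gpu_isa in binary.machine_code:
        return True
    # No match: the driver can only JIT if an IR was shipped and the GPU
    # is at least as new as the IR's target (simplified rule).
    if binary.ir_min_capability is not None and gpu_capability is not None:
        return gpu_capability >= binary.ir_min_capability
    return False


# Machine code only, built for one target.
hip_single = Binary(machine_code={"gfx803"})
# Fat binary: every target has to be listed at build time.
hip_fat = Binary(machine_code={"gfx803", "gfx1030"})
# Machine code for one GPU generation plus IR for anything newer.
cuda_ptx = Binary(machine_code={"sm_61"}, ir_min_capability=(6, 1))

print(can_run(hip_single, "gfx803"))        # True
print(can_run(hip_single, "gfx1030"))       # False
print(can_run(hip_fat, "gfx1030"))          # True
print(can_run(cuda_ptx, "sm_86", (8, 6)))   # True
```

The point is the last line: the IR-carrying binary runs on a GPU that didn't exist in its build configuration, while the machine-code-only binaries only cover whatever was listed at build time.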