|
|
|
|
|
by JonChesterfield
701 days ago
|
|
Huh. Inline assembly is strongly associated in my mind with writing things that can't be represented in LLVM IR, but in the specific case of PTX - you can only write things that ptxas understands, and that probably rules out wide classes of horrendous behaviour. Raw bytes being used for instructions and for data, ad hoc self modifying code and so forth. I believe nvcc is roughly an antique clang build hacked out of all recognition. I remember it rejecting templates with 'I' as the type name and working when changing to 'T', nonsense like that. The HIP language probably corresponds pretty closely to clang's cuda implementation in terms of semantics (a lot of the control flow in clang treats them identically), but I don't believe an exact match to nvcc was considered particularly necessary for the clang -x cuda work. The ptx to llvm IR approach is clever. I think upstream would be game for that, feel free to tag me on reviews if you want to get that divergence out of your local codebase. |
|