


	by almostgotcaught 739 days ago
	some people emit llvm ir (maaaaybe ptx) directly instead of using the C/C++ frontend to CUDA. that's absolutely the only optional part of the stack and also basically the most trivial (i.e., it's not the frontend that's hard but the target codegen).


sudosysgen 739 days ago

LLVM IR to machine code is not the part that AMD has traditionally struggled with. What you call "trivial" is. If everyone started emitting IR and didn't rely on NVidia-owned libs then the space would become unrecognizable. The codegen is something AMD has always been decent at, hence them beating NVidia in compute benchmarks for most of the past 20 years.


almostgotcaught 739 days ago

> LLVM IR to machine code is not the part that AMD has traditionally struggled with.

alright fine it's the codegen and the runtime and the driver and the library ecosystem...

> If everyone started emitting IR and didn't rely on NVidia-owned libs then the space would become unrecognizable.

I have no clue what this means - which libs are you talking about here? the libs that contain the implementations of their runtime? or the libs that contain the user space components of their driver? or the libs that contain their driver and firmware code? And exactly which of these will "everyone emitting IR" save us from?


sudosysgen 739 days ago

I am talking about user and user-level libraries, so from PyTorch to cuBLAS. The rest is currently serviceable and at times was even slightly better than NVidia. If people start shipping code that targets, say, LLVM IR (that then gets converted to PTX or whatever), like one would do using SYCL, we only have to rely on the bare minimum.


imtringued 739 days ago

AMD is struggling with unsafe C and C++ code breaking their drivers.
