


	by t55 491 days ago
	Triton sits between CUDA and PyTorch in level of abstraction and is built to work smoothly within the PyTorch ecosystem. In CUDA, on the other hand, you can directly manipulate warp-level primitives and fine-tune memory prefetching to reduce latency in, e.g., attention algorithms. That is a level of control that Triton and PyTorch don't offer, AFAIK.
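
	To make "warp-level primitives" concrete, here is a rough pure-Python emulation of the shuffle-down reduction pattern that CUDA exposes through `__shfl_down_sync`. This is only a sketch: it runs on the CPU, the names `shfl_down` and `warp_reduce_sum` are made up for illustration, and a warp size of 32 is assumed.

```python
WARP_SIZE = 32  # assumed warp width (lanes per warp)


def shfl_down(values, delta):
    """Emulate a shuffle-down: lane i reads the value held by lane i + delta.

    Lanes whose source lane would fall outside the warp keep their own value.
    """
    n = len(values)
    return [values[i + delta] if i + delta < n else values[i] for i in range(n)]


def warp_reduce_sum(values):
    """Tree-reduce one warp's worth of values; lane 0 ends up with the total."""
    assert len(values) == WARP_SIZE
    offset = WARP_SIZE // 2
    while offset > 0:
        shifted = shfl_down(values, offset)
        # Every lane adds the value it read from the lane `offset` above it.
        values = [v + s for v, s in zip(values, shifted)]
        offset //= 2
    return values[0]


if __name__ == "__main__":
    print(warp_reduce_sum(list(range(WARP_SIZE))))
```

	On a real GPU each lane is a thread and the exchange happens in registers, with no trip through shared memory, which is the kind of detail you get to control when writing CUDA directly.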

1 comments

pjmlp 490 days ago

MLIR extensions for Python do, though, as far as I could tell from the LLVM developer meeting.

6gvONxR4sf7o 490 days ago

MLIR is one of those things everyone seems to use, but nobody seems to want to write solid introductory docs for :(

I've been curious for a few years now to get into MLIR, but I don't know compilers or LLVM, and all the docs I've found seem to assume knowledge of one or the other.

(yes this is a plea for someone to write an 'intro to compilers' using MLIR)

pjmlp 490 days ago

Not sure if you will be able to follow along, but here is what I was talking about:

"PyDSL: A MLIR DSL for Python developers"

https://www.youtube.com/watch?v=iYLxgTRe8TU

"PyDSL, a subset of Python for constructing affine & transform dialects"

https://www.youtube.com/watch?v=nmtHeRkl850

And the MLIR channel:

https://www.youtube.com/@MLIRCompiler
