|
|
|
|
|
by adgjlsfhk1
1532 days ago
|
|
you might be surprised. I don't know a lot about how it generates gpu kernels, but for CPU it uses LoopVectorization which is able to (among other things) derive optimal gemm kernels automatically. For GPU, I believe tullio used KernelAbstractions.jl to generate the kernels, but I'm less familiar with the details. |
|