by noxa
1574 days ago
Correct! We're currently compiling to SPIR-V and then going through MoltenVK/spirv-cross to get to Metal, and our SPIR-V needs quite a bit of work (it's currently tuned for Mali, which is a very different architecture). The biggest limiter, though, is that, as you note, Apple doesn't offer a way to use the fancy hardware instructions, whereas our CUDA version is using the Tensor Cores. The good news (for us) is that this is all effectively -O1 today; there's still potential for 2-8x speedups over these numbers, and that's not even factoring in the inaccessible HW features that maybe one day Apple will expose :crossed-fingers: :) // IREE dev