by noxa 1574 days ago
Correct! We're currently compiling to SPIR-V and then going through MoltenVK/spirv-cross to get to Metal, and our SPIR-V needs quite a bit of work (it's currently tuned for Mali, which is a very different architecture). The biggest limiter, though, is that, as you note, Apple doesn't provide a way to use the fancy hardware instructions, whereas our CUDA version uses the Tensor Cores.

The good news (for us) is that this is all effectively -O1 today; there's still potential for 2-8x speedups over these numbers, and that's not even factoring in the inaccessible HW features that maybe one day Apple will expose :crossed-fingers: :)

// IREE dev


Any chance of using the Neural Engine (otherwise exposed by CoreML)?
Unlikely, since the interface to the ANE is not public and may change between hardware versions.