by junrushao1994 1087 days ago

> Can you comment on how difficult it was to achieve this, and what the relative advantages are between cards?

Thanks for asking! I personally believe TVM Unity is a proper software stack for ML compilation (MLC), and its existing optimizations (e.g. TensorCore offloading) can be transparently transferred to AMD/Intel/Apple/mobile GPUs without too much engineering effort.

Of course, my claim is limited to ML workloads. I'm not an expert outside the ML world, so I can't speak for general HPC.