by earhart 1099 days ago

IMHO, it comes down to the software.

It turns out you need very different kernels for good performance on different GPUs, so OpenCL is a nice tool, but not sufficient; you need a hardware-specific kernel library.
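To make that concrete, here is a rough sketch of what I mean by a hardware-specific kernel library: a table of kernels keyed by operation and device family, with a portable fallback. Everything in it is made up for illustration (the `register` and `lookup` helpers, the `"vendor_x"` device name, and the plain-Python "kernels" standing in for tuned GPU code); it is not how PlaidML or any real framework does it.

```python
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]
Kernel = Callable[[Matrix, Matrix], Matrix]

# Hypothetical registry: (operation, device family) -> kernel implementation.
_KERNELS: Dict[Tuple[str, str], Kernel] = {}


def register(op: str, device: str) -> Callable[[Kernel], Kernel]:
    """Decorator that files a kernel under (op, device)."""
    def wrap(fn: Kernel) -> Kernel:
        _KERNELS[(op, device)] = fn
        return fn
    return wrap


@register("matmul", "generic")
def matmul_generic(a: Matrix, b: Matrix) -> Matrix:
    # Portable fallback: straightforward row-by-column product.
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


@register("matmul", "vendor_x")  # "vendor_x" is an invented device family
def matmul_vendor_x(a: Matrix, b: Matrix) -> Matrix:
    # Stand-in for a tuned variant: same result, different access pattern
    # (walks a transposed copy of b so each dot product reads a contiguous row).
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def lookup(op: str, device: str) -> Kernel:
    """Return the device-specific kernel if one exists, else the generic one."""
    return _KERNELS.get((op, device)) or _KERNELS[(op, "generic")]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(lookup("matmul", "vendor_x")(a, b))
    print(lookup("matmul", "some_other_gpu")(a, b))  # falls back to generic
```

The point is that the expensive part is not the lookup but writing and maintaining a good entry for every (operation, device) pair.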

From the framework side, each integration is relatively expensive to support, so you really don’t want to invest in many of them. Without some sort of kernel API standard, you end up with a proprietary solution, and NVidia did an amazing job of investing in their software, so that’s the way things go.

I think we had a pretty solid foundation for doing something smarter with PlaidML, but after we were bought by Intel, some architectural decisions and some business decisions consigned it to being a research project; I don’t know that it’s going anywhere.

These days, I’d probably look into OctoML / TVM, or maybe Modular, for a better solution in this space… or just buy NVidia.

(I worked a bit on Intel’s Meteor Lake VPU; it’s a lovely machine, but I’m not sure what the story will be for general framework integrations. I bet OpenVINO will run really well on it, though :-)