Making PyTorch -> Qualcomm NPUs less treacherous (muna.ai)
1 points by olokobayusuf 108 days ago
1 comments

There are over 2.5 billion Qualcomm processors in the world today (PC, mobile, automotive, etc.). But bringing AI models to run on Qualcomm processors is a (massive) pain: their 2GB+ SDK holds an encyclopedia's worth of information that you need to absorb to deploy correctly.

We're working to make Qualcomm NPUs a first-class deployment target for PyTorch. Devs write a Python function that runs a PyTorch model, then use our `@compile` decorator to transpile the model to a Qualcomm-specific C++ implementation (DLC), which compiles to a self-contained shared library.
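
To give a feel for the shape of that workflow, here is a rough sketch. It is not our actual API: `compile_sketch`, `CompileSpec`, `REGISTRY`, the tag, and the `"qualcomm-npu"` target string are all hypothetical stand-ins made up for illustration, and the decorator only records metadata instead of transpiling anything. `torch` is imported inside the function so the decorator mechanics can be inspected without it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CompileSpec:
    """Hypothetical metadata a compiler front end might attach to a function."""
    tag: str
    target: str


# Hypothetical registry of functions marked for compilation.
REGISTRY: Dict[str, Callable] = {}


def compile_sketch(tag: str, target: str = "qualcomm-npu"):
    """Stand-in for a `@compile`-style decorator (assumption, not the real API).

    It only tags and registers the function; a real tool would trace the
    model and emit native code at this point.
    """
    def decorator(fn: Callable) -> Callable:
        fn.compile_spec = CompileSpec(tag=tag, target=target)
        REGISTRY[tag] = fn
        return fn
    return decorator


@compile_sketch(tag="@example/tiny-classifier", target="qualcomm-npu")
def classify(pixels: List[float]) -> int:
    """Run a tiny PyTorch model on four input values and return the top class."""
    import torch  # imported lazily; only needed when the function is called

    model = torch.nn.Sequential(
        torch.nn.Flatten(),
        torch.nn.Linear(4, 2),
    ).eval()
    with torch.inference_mode():
        x = torch.tensor(pixels, dtype=torch.float32).reshape(1, 2, 2)
        logits = model(x)
    return int(logits.argmax(dim=1).item())
```

The point of the pattern is that the function stays ordinary Python you can call and debug locally (for example `classify([0.1, 0.2, 0.3, 0.4])`), while the decorator marks it as something a compiler can pick up later.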

Qualcomm NPUs are fast: 1.8x faster than ONNX Runtime. See the link above.