Hacker News new | ask | show | jobs
by electroglyph 267 days ago
i exclusively use ONNX models across platforms for CPU inference. it's usually the fastest option on CPU. hacking on ONNX graphs is super easy, too...i make my own uint8 output ONNX embedding models