by FL33TW00D 1060 days ago
ONNX is bloated! I got some LLMs working on my own Rust + WebGPU framework a few months ago: https://summize.fleetwood.dev/

I've since moved away from ONNX to a more GGML-style approach.

Do you have any good resources or links on using ggml with wasm?
I think the Whisper example is your best bet! https://github.com/ggerganov/whisper.cpp/tree/master/example...
Hey! This is what I've been working on, would love to chat, feel free to email
Sure! My email is in my profile.
What's the difference between the ONNX and GGML styles?
ONNX consumes a .onnx file, which contains both the definition of the network and the weights. GGML instead consumes just the weights and defines the network in code.

Being bound to ONNX means moving more slowly: the field changes so fast that you need complete control.
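
To make the distinction concrete, here is a minimal toy sketch in plain Python. It does not use the real ONNX or GGML APIs; `MODEL_FILE`, `WEIGHTS`, `run_graph`, and `forward` are hypothetical names invented for illustration. The first style ships a graph description alongside the weights and lets a generic interpreter walk it; the second ships only weights and writes the network as ordinary code.

```python
# Toy illustration only: this is NOT the ONNX or GGML API.
# All names here are hypothetical and exist just for this sketch.

def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def add(a, b):
    """Element-wise vector addition."""
    return [ai + bi for ai, bi in zip(a, b)]

def relu(a):
    """Element-wise max(0, v)."""
    return [max(0.0, ai) for ai in a]

WEIGHTS = {"w": [[1.0, -2.0], [0.5, 0.5]], "b": [0.0, -1.0]}

# Style 1 ("ONNX-like"): the file carries the graph AND the weights,
# and a generic runtime interprets the graph node by node.
MODEL_FILE = {
    "weights": WEIGHTS,
    "graph": [
        {"op": "matvec", "inputs": ["w", "x"], "output": "h0"},
        {"op": "add", "inputs": ["h0", "b"], "output": "h1"},
        {"op": "relu", "inputs": ["h1"], "output": "y"},
    ],
}

OPS = {"matvec": matvec, "add": add, "relu": relu}

def run_graph(model, x):
    """Generic interpreter: knows the ops, not the architecture."""
    env = dict(model["weights"])
    env["x"] = x
    for node in model["graph"]:
        args = [env[name] for name in node["inputs"]]
        env[node["output"]] = OPS[node["op"]](*args)
    return env["y"]

# Style 2 ("GGML-like"): the file carries only the weights,
# and the architecture is written directly in code.
def forward(weights, x):
    return relu(add(matvec(weights["w"], x), weights["b"]))
```

Both `run_graph(MODEL_FILE, x)` and `forward(WEIGHTS, x)` compute the same thing. The difference is where the architecture lives: changing the network in the first style means changing the exported file (and needing every op to be supported by the runtime), while in the second it means editing `forward`.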

I haven't used ONNX or GGML, but presumably using GGML means you need to reimplement the network architecture?
You do! But it offers quite a fluid API making it pretty simple. You can see my attempt at a torchesque API here: https://twitter.com/fleetwood___/status/1679889450623459328
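
As a rough idea of what reimplementing an architecture behind a torch-like API can look like, here is a small plain-Python sketch. It is not the API from the linked tweet and not GGML's API; `Linear`, `ReLU`, `Sequential`, `build_mlp`, and the `fc1.*`/`fc2.*` weight names are all assumptions made up for this example.

```python
# Hypothetical torch-like module API in plain Python.
# Not the commenter's framework and not GGML; names are illustrative.

class Linear:
    """y = W x + b, with W stored as a list of rows."""
    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

    def __call__(self, x):
        return [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(self.weight, self.bias)
        ]

class ReLU:
    """Element-wise max(0, v)."""
    def __call__(self, x):
        return [max(0.0, v) for v in x]

class Sequential:
    """Apply layers in order, feeding each output into the next layer."""
    def __init__(self, *layers):
        self.layers = layers

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

def build_mlp(weights):
    """Define the architecture in code; only the weights come from a file."""
    return Sequential(
        Linear(weights["fc1.weight"], weights["fc1.bias"]),
        ReLU(),
        Linear(weights["fc2.weight"], weights["fc2.bias"]),
    )
```

Here the loaded weights would be a plain dictionary keyed by tensor name, and `build_mlp(weights)(x)` runs the forward pass. Swapping a layer or adding a new one is a code change rather than a re-export of the model file.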