| HN Mirror



	by FL33TW00D 1108 days ago
	ONNX is bloated! I got some LLMs working on my own Rust + WebGPU framework a few months ago: https://summize.fleetwood.dev/ I've since moved away from ONNX to a more GGML-style approach.

2 comments

naillo 1108 days ago

Do you have any good resources or links on using ggml with wasm?


FL33TW00D 1108 days ago

I think the Whisper example is your best bet! https://github.com/ggerganov/whisper.cpp/tree/master/example...


bkitano19 1108 days ago

Hey! This is what I've been working on, would love to chat, feel free to email


FL33TW00D 1108 days ago

Sure! My email is in my profile.


taminka 1108 days ago

What's the difference between ONNX and GGML style?


FL33TW00D 1108 days ago

With ONNX, the runtime consumes a .onnx file, which contains both the definition of the network and its weights. GGML instead consumes just the weights and defines the network in code.

Being bound to ONNX means moving more slowly; the field moves so fast that you need complete control.
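
To make the distinction above concrete, here is a minimal toy sketch in plain Python. It is not how ONNX or GGML are actually implemented, and every name in it (`MODEL_FILE`, `run_graph`, `forward`, the tensor names) is hypothetical. It only illustrates the idea: in one style the file describes the graph and a generic interpreter runs it, in the other the file holds only weights and the architecture is ordinary code.

```python
# Toy contrast between the two styles; all names here are hypothetical.

def matvec(w, x):
    """Multiply a row-major matrix (a list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def relu(x):
    return [max(0.0, v) for v in x]

# Weights for a tiny two-layer network, keyed by tensor name.
WEIGHTS = {
    "fc1.weight": [[1.0, -1.0], [0.5, 0.5]],
    "fc2.weight": [[1.0, 2.0]],
}

# ONNX style: the file carries the graph *and* the weights, and a
# generic runtime interprets whatever graph it is handed.
MODEL_FILE = {
    "graph": [
        {"op": "matmul", "weight": "fc1.weight"},
        {"op": "relu"},
        {"op": "matmul", "weight": "fc2.weight"},
    ],
    "weights": WEIGHTS,
}

def run_graph(model, x):
    for node in model["graph"]:
        if node["op"] == "matmul":
            x = matvec(model["weights"][node["weight"]], x)
        elif node["op"] == "relu":
            x = relu(x)
        else:
            # A new op needs runtime support before any model can use it.
            raise ValueError(f"unsupported op: {node['op']}")
    return x

# GGML style: the file carries only the weights; the architecture is
# ordinary code, so changing it means editing this function.
def forward(weights, x):
    h = relu(matvec(weights["fc1.weight"], x))
    return matvec(weights["fc2.weight"], h)
```

Both `run_graph(MODEL_FILE, x)` and `forward(WEIGHTS, x)` compute the same network. The difference is where the architecture lives: in the data for the first, in the code for the second.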


michaelmior 1108 days ago

I haven't used ONNX or GGML, but presumably using GGML means you need to reimplement the network architecture?


FL33TW00D 1108 days ago

You do! But it offers quite a fluid API, which makes it pretty simple. You can see my attempt at a torchesque API here: https://twitter.com/fleetwood___/status/1679889450623459328
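
For a rough idea of what reimplementing an architecture behind a torch-like API can look like, here is a small sketch in plain Python. It is not the API from the linked tweet, and it is not GGML's API. The classes `Linear`, `ReLU`, `Sequential`, the helper `build_mlp`, and the tensor names are all assumptions made for illustration.

```python
# Hypothetical torch-like modules; not the API from the linked tweet.

class Linear:
    def __init__(self, weights, prefix):
        # Look tensors up by name, as a weights-only file might store them.
        self.weight = weights[f"{prefix}.weight"]
        self.bias = weights[f"{prefix}.bias"]

    def __call__(self, x):
        return [
            sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(self.weight, self.bias)
        ]

class ReLU:
    def __call__(self, x):
        return [max(0.0, v) for v in x]

class Sequential:
    def __init__(self, *layers):
        self.layers = layers

    def __call__(self, x):
        # Apply each layer in order, feeding its output to the next.
        for layer in self.layers:
            x = layer(x)
        return x

def build_mlp(weights):
    """The architecture lives here, in code, rather than in the weights file."""
    return Sequential(Linear(weights, "fc1"), ReLU(), Linear(weights, "fc2"))
```

With modules like these, supporting a new architecture means writing another `build_*` function against the tensor names in the weights file, which is the reimplementation cost asked about above.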
