HN Mirror

by smcnally 456 days ago

`model.safetensors` for Qwen3-0.6B is a single 1.5GB file.

Qwen3-235B-A22B has 118 `.safetensors` files at 4GB each.

There are a bunch of models and quants between those.

1 comment

Does it run in 8x80G? Or does the KV cache and other buffers push it over the edge?
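
A rough back-of-envelope for that question, as a sketch under stated assumptions rather than a measurement. It takes the 118 shards at 4GB each from the post at face value, treats 8x80G as 640GB of total GPU memory, and reserves a guessed 10% for activations, fragmentation, and runtime buffers. The layer count, KV head count, and head dimension below are assumed values for illustration; check them against the model's `config.json` before trusting the result. The KV cache is assumed to be stored in 2-byte elements.

```python
def weights_gb(num_shards: int, shard_gb: float) -> float:
    """Total checkpoint size, assuming every shard is the same size."""
    return num_shards * shard_gb


def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache one token occupies: a K and a V vector per layer."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem


def headroom_gb(total_gpu_gb: float, weights: float,
                overhead_fraction: float = 0.10) -> float:
    """Memory left for KV cache after weights and a reserved overhead slice."""
    return total_gpu_gb * (1 - overhead_fraction) - weights


# Figures from the post: 118 shards at 4GB each, on 8 GPUs with 80GB each.
weights = weights_gb(118, 4.0)   # 472.0 GB
total = 8 * 80                   # 640 GB

# ASSUMPTION: 10% of GPU memory is lost to buffers and fragmentation.
free = headroom_gb(total, weights)

# ASSUMPTION: 94 layers, 4 KV heads, head_dim 128 (verify in config.json).
per_token = kv_cache_bytes_per_token(num_layers=94, num_kv_heads=4, head_dim=128)

# Total tokens of KV cache that fit, summed across all concurrent sequences.
max_tokens = int(free * 1e9 // per_token)

print(f"weights: {weights:.0f} GB of {total} GB")
print(f"headroom for KV cache: {free:.0f} GB")
print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"approx. total cached tokens: {max_tokens:,}")
```

Under these assumptions the weights alone take 472GB of the 640GB, so they fit. Whether the KV cache pushes it over depends on context length times the number of concurrent sequences, which is what `max_tokens` bounds.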