| HN Mirror



	by thtmnisamnstr 149 days ago
	The general rule to follow is that you need at least as much VRAM as the size of the model. 30B models are usually around 19GB, so you would most likely need a GPU with 24GB of VRAM.
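
	As a rough back-of-the-envelope sketch of that rule (not a measurement): weight size is roughly parameter count times bits per weight. The helper names, the 5 bits per weight figure, and the 2GB overhead allowance below are assumptions picked for illustration; 5 bits per weight is simply the value that lands near the 19GB quoted above for a 30B model.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_vram(n_params: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if the weights plus an assumed fixed overhead fit in VRAM.

    overhead_gb is a hypothetical allowance for runtime buffers and a
    small context; real overhead depends on the runtime and settings.
    """
    return weight_size_gb(n_params, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    # 30B parameters at an assumed ~5 bits per weight
    print(weight_size_gb(30e9, 5))
    print(fits_in_vram(30e9, 5, vram_gb=24))
    # The same model at 16 bits per weight does not come close
    print(fits_in_vram(30e9, 16, vram_gb=24))
```

	This only counts the weights plus a flat allowance, so it says nothing about how much context you can actually use.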

1 comments

3836293648 148 days ago

But this also means tiny context windows. You can't fit gpt-oss:20b + more than a tiny file + instructions into 24GB


blizdiddy 148 days ago

Gpt-oss is natively 4-bit, so you kinda can


3836293648 146 days ago

You can fit the weights + a tiny context window into 24GB, absolutely. But you can't fit a context of any reasonable size. Either that, or Ollama's implementation is broken: when I last tried it, the context had to be restricted beyond usability to keep it from freezing up the entire machine.
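
For anyone wanting to sanity-check the weights-versus-context tradeoff, here is a minimal sketch of the usual KV-cache estimate, assuming plain full attention with an unquantized 16-bit cache. The function names and every number in the example (48 layers, 8 KV heads, head dimension 128, 19GB of weights, 1GB overhead) are hypothetical placeholders, not the real configuration of gpt-oss or any other model. Models with sliding-window attention or a quantized cache will need less than this.

```python
def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """Bytes of KV cache per token under full attention.

    The leading 2 accounts for storing one key and one value vector
    per layer and per KV head.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def kv_cache_gb(n_tokens: int, n_layers: int, n_kv_heads: int,
                head_dim: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in decimal gigabytes for n_tokens."""
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim,
                                   bytes_per_value)
    return per_token * n_tokens / 1e9


def max_context_tokens(vram_gb: float, weights_gb: float, n_layers: int,
                       n_kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2,
                       overhead_gb: float = 1.0) -> int:
    """Tokens of context that fit in whatever VRAM the weights leave free.

    overhead_gb is an assumed allowance for activations and buffers.
    """
    free_bytes = (vram_gb - weights_gb - overhead_gb) * 1e9
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim,
                                   bytes_per_value)
    return max(0, int(free_bytes // per_token))


if __name__ == "__main__":
    # Hypothetical model shape, for illustration only
    shape = dict(n_layers=48, n_kv_heads=8, head_dim=128)
    print(kv_cache_gb(8192, **shape))
    print(max_context_tokens(vram_gb=24, weights_gb=19, **shape))
```

Plugging in the real layer count, KV head count, and head dimension from a model's config gives a first guess at whether the limit is the math or the runtime.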
