


	by milgrum 281 days ago
	How many TPS do you get running GPT OSS 120b on the 395+? I’m considering a Framework Desktop for a similar use case, but I’ve been reading mixed things about performance (specifically with regard to memory bandwidth, though I’m not sure that’s really the underlying issue).

1 comment

30-40 TPS at 64k context, but it's a mixture-of-experts model, so only a fraction of the weights are active for each token.

A 70b dense model is slower, since every weight has to be read for each token.

Qwen coder 30b at Q4 runs at 40+ TPS.
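
For a rough feel of why the MoE vs. dense distinction matters when memory bandwidth is the limit, here's a back-of-the-envelope sketch. It assumes decode speed is capped by how fast the active weights can be streamed from memory once per token. Every number in it is an illustrative assumption, not something measured in this thread: 256 GB/s peak bandwidth, about 5B active parameters per token for the MoE model, and 4 bits per weight. The function name `decode_tps_upper_bound` is made up for this example.

```python
def decode_tps_upper_bound(bandwidth_gb_s: float,
                           active_params_billion: float,
                           bits_per_weight: float) -> float:
    """Rough ceiling on decode tokens/sec, assuming each active weight
    is read from memory exactly once per generated token."""
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Assumed figures for illustration only (not taken from the thread).
BANDWIDTH_GB_S = 256.0      # assumed theoretical peak memory bandwidth
MOE_ACTIVE_PARAMS_B = 5.0   # assumed active parameters per token (MoE)
DENSE_PARAMS_B = 70.0       # a dense 70b model touches all of its weights
BITS_PER_WEIGHT = 4.0       # assumed ~4-bit quantization

moe_tps = decode_tps_upper_bound(BANDWIDTH_GB_S, MOE_ACTIVE_PARAMS_B, BITS_PER_WEIGHT)
dense_tps = decode_tps_upper_bound(BANDWIDTH_GB_S, DENSE_PARAMS_B, BITS_PER_WEIGHT)

print(f"MoE ceiling:   {moe_tps:.1f} tokens/s")
print(f"Dense ceiling: {dense_tps:.1f} tokens/s")
```

These are ceilings, not predictions. Real numbers come in lower because of KV cache reads at long context, attention compute, and bandwidth that never reaches the theoretical peak. The useful part is the ratio: under these assumptions the dense model's ceiling is about 14x lower than the MoE's.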