| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by Gracana 117 days ago
	There's a reddit comment here https://www.reddit.com/r/LocalLLaMA/comments/1r4m4it/comment... that says: my system is running GLM-5 MXFP4 at about 17 tok/s. That’s with a single RTX Pro 6000 on an EPYC 9455P with 12 channels of DDR5-6400. Only 16k context though, since it’s too slow to use for programming anyway and that’s the only application where I need big context.