HN Mirror


	by Tepix 27 days ago
	Kimi K2.6 does not run well on 256GB.

2 comments

zozbot234 27 days ago

Have you tried it? It would be slow for sure, but the main limitation AIUI would actually be storing the context in RAM - models like Kimi and GLM have high demands there which limit your ability to get meaningful aggregate throughput via large batches.

Tepix 25 days ago

No need to try, really. With 1100B weights and 256GB of RAM, that's less than 1.8 bits per weight if you want a little bit of context.

How is that supposed to give good results?
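
The back-of-the-envelope arithmetic in this comment can be sketched roughly as follows. It assumes decimal gigabytes and reads "1100b" as 1.1 trillion parameters; the 10GB context reserve and the helper name `bits_per_weight` are hypothetical, chosen only to illustrate "a little bit of context".

```python
def bits_per_weight(ram_gb: float, params_billion: float, reserved_gb: float = 0.0) -> float:
    """Average bits available per weight after setting some RAM aside."""
    usable_bits = (ram_gb - reserved_gb) * 1e9 * 8   # decimal GB -> bits
    return usable_bits / (params_billion * 1e9)

# All 256GB spent on weights, nothing left for context.
all_ram = bits_per_weight(256, 1100)

# Hypothetical 10GB reserve for context (an assumption, not from the thread).
with_context = bits_per_weight(256, 1100, reserved_gb=10)

print(f"{all_ram:.2f} bits/weight with no context")
print(f"{with_context:.2f} bits/weight with 10GB reserved")
```

Under these assumptions the budget lands just above 1.8 bits per weight with no context at all, and drops below 1.8 as soon as some RAM is reserved.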

girvo 27 days ago

True, I might be thinking of some of the community’s four-Spark clusters for it; it’s already int4, right?

Tepix 25 days ago

Yeah, the default quants are 595GB. Even four Sparks would require a quant lower than 4-bit.
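
A rough check of those figures, as a sketch only. It assumes decimal gigabytes, the 1100B parameter count quoted earlier in the thread, and 128GB of memory per Spark; the per-Spark figure is an assumption, not something stated here.

```python
PARAMS = 1100e9          # 1100B weights, as quoted earlier in the thread
DEFAULT_QUANT_GB = 595   # size of the default quants, from the comment above
SPARK_GB = 128           # ASSUMED memory per Spark (not stated in the thread)

def avg_bits(size_gb: float, params: float = PARAMS) -> float:
    """Average bits per weight for a model occupying size_gb decimal gigabytes."""
    return size_gb * 1e9 * 8 / params

default_bits = avg_bits(DEFAULT_QUANT_GB)   # effective width of the default quants
cluster_bits = avg_bits(4 * SPARK_GB)       # upper bound for four Sparks, no context

print(f"default quants: {default_bits:.2f} bits/weight")
print(f"four Sparks:    {cluster_bits:.2f} bits/weight at most")
```

With these assumptions the default quants average a little over 4 bits per weight, while four Sparks top out below 4 bits per weight before any memory is set aside for context.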
