HN Mirror

	by treprinum 929 days ago
	I would say so based on LLaMA 2 70B; if it's 8x inference in MoE then I guess you'd see <20 tokens/sec?
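	A rough way to sanity-check that kind of guess, as a sketch only: assume decoding is memory-bound, so tokens/sec scales inversely with the parameters read per token. The baseline speed and the MoE shape below are hypothetical placeholders, not measurements, and `scaled_tokens_per_sec` is an illustrative helper rather than anything from a library.

```python
def scaled_tokens_per_sec(baseline_tps, baseline_params_b, active_params_b):
    """Scale a known decode speed by the ratio of parameters read per token.

    Assumes memory-bound decoding and the same hardware and quantization
    for both models; this is a rule of thumb, not a benchmark.
    """
    if baseline_tps <= 0 or baseline_params_b <= 0 or active_params_b <= 0:
        raise ValueError("all inputs must be positive")
    return baseline_tps * baseline_params_b / active_params_b


# Hypothetical inputs, chosen only to illustrate the arithmetic.
baseline_tps = 10.0        # assumed tokens/sec for a dense 70B model
experts = 8                # assumed number of experts, all run per token
expert_params_b = 7.0      # assumed size of each expert, in billions

all_experts_tps = scaled_tokens_per_sec(
    baseline_tps, 70.0, experts * expert_params_b
)
print(f"all {experts} experts per token: ~{all_experts_tps:.1f} tokens/sec")
```

	If only some of the experts run for each token, pass the smaller active parameter count instead and the estimate rises proportionally.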