	by pk-protect-ai 916 days ago
	I did some minimal testing: during inference, Mamba uses about 60% of the VRAM that RetNet (parallel forward mode) uses, for a model of the same size and a vocabulary of the same size.