by pk-protect-ai 916 days ago
I did some minimal testing: during inference, Mamba uses about 60% of the VRAM that RetNet (parallel forward mode) uses, for models of the same size and with the same vocabulary size.