by simlevesque 930 days ago
How fast are these small models on a 4090? Is it like 100 ms? 500 ms?

Mistral-7B gives you about 80 tokens/second on a 4090, so this one will be faster...
It’s about twice the speed.
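
The question was asked in milliseconds and the answer came back in tokens/second, so here is a rough sketch of the conversion. It uses the 80 tokens/second figure quoted above and reads "twice the speed" as 2x that number, which is my assumption and not something the thread spells out. The function names are made up for illustration, and the estimate ignores prompt processing and time to first token.

```python
def ms_per_token(tokens_per_second: float) -> float:
    """Average decode latency per generated token, in milliseconds."""
    return 1000.0 / tokens_per_second


def response_time_ms(num_tokens: int, tokens_per_second: float) -> float:
    """Rough time to generate num_tokens, ignoring prompt processing."""
    return num_tokens * ms_per_token(tokens_per_second)


MISTRAL_7B_TPS = 80.0  # figure quoted in the thread for a 4090
ASSUMED_SPEEDUP = 2.0  # assumption: "twice the speed" means 2x Mistral-7B

for label, tps in [
    ("Mistral-7B", MISTRAL_7B_TPS),
    ("smaller model (assumed 2x)", MISTRAL_7B_TPS * ASSUMED_SPEEDUP),
]:
    print(f"{label}: {ms_per_token(tps):.2f} ms/token, "
          f"{response_time_ms(40, tps):.0f} ms for a 40-token reply")
```

So at 80 tokens/second, 100 ms buys about 8 tokens and 500 ms about 40; whether a reply takes 100 ms or 500 ms depends mostly on how long it is.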