by alfor 1165 days ago
Do you know what kind of speed you can expect with inference on CPU?
1 comment

Yes, around 2 tokens per second on a newer CPU with fast RAM.
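
For a rough sanity check on a figure like this, here is a back-of-the-envelope sketch. It assumes token generation on CPU is limited by memory bandwidth, with every weight read from RAM once per generated token, which would be why fast RAM matters. The function name and the example numbers are hypothetical; the thread does not say which model or hardware is involved.

```python
def estimate_tokens_per_second(model_size_gb: float, memory_bandwidth_gb_s: float) -> float:
    """Rough upper bound on generation speed.

    Assumption: each generated token requires reading all model weights
    from RAM once, so speed is about bandwidth divided by model size.
    Real throughput will usually be lower than this.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return memory_bandwidth_gb_s / model_size_gb


# Hypothetical example values, not taken from the thread:
# a 20 GB model in RAM and 40 GB/s of usable memory bandwidth.
print(estimate_tokens_per_second(model_size_gb=20.0, memory_bandwidth_gb_s=40.0))
```

By this estimate, a smaller (for example, more heavily quantized) model or faster memory raises the ceiling proportionally.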