by AeiumNE 1165 days ago
It's possible to run on the CPU and use system memory.
1 comment

Do you know what kind of speed you can expect with inference on CPU?
Yes, around 2 tokens per second on a newer CPU with fast RAM.
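
For a rough sense of where a number like that could come from, here's a back-of-the-envelope sketch. It assumes generation is limited by memory bandwidth, i.e. every token requires reading all of the model's weights from RAM once. The function name and the example figures (a 40 GB model, 80 GB/s of RAM bandwidth) are hypothetical and not from the thread; real throughput will usually be lower.

```python
def estimate_tokens_per_second(model_size_gb: float, memory_bandwidth_gb_s: float) -> float:
    """Rough upper bound on CPU generation speed.

    Assumption: each generated token streams the full set of weights
    from system memory once, so speed ~= bandwidth / model size.
    Compute limits, caches, and other overhead are ignored.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return memory_bandwidth_gb_s / model_size_gb


# Hypothetical figures, chosen only for illustration.
print(estimate_tokens_per_second(model_size_gb=40.0, memory_bandwidth_gb_s=80.0))
```

By this estimate, a smaller or more heavily quantized model, or faster RAM, raises the ceiling proportionally.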