by com2kid
297 days ago
Even 40 tokens per second is plenty for real-time usage. The average person reads at ~4 words per second, and 40 tokens per second works out to 15-20 words per second. Even useful models like gemma3 27b are hitting 22 t/s on 4-bit quants. You aren't going to be reformatting gigabytes of PDFs or anything, but for a lot of common use cases, those speeds are fine.
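
For anyone who wants to check the arithmetic, here is a rough sketch. The words-per-token ratios are assumptions, not measurements: 0.4 to 0.5 is what the 15-20 words per second figure implies, and 0.75 is a commonly quoted rule of thumb for English text. The real ratio depends on the tokenizer and the content. The function names are made up for illustration.

```python
# Reading speed quoted in the comment above, in words per second.
READING_SPEED_WPS = 4.0


def words_per_second(tokens_per_second: float, words_per_token: float) -> float:
    """Convert a generation rate in tokens/s to an approximate words/s."""
    return tokens_per_second * words_per_token


def headroom(tokens_per_second: float, words_per_token: float) -> float:
    """How many times faster than reading speed the model generates text."""
    return words_per_second(tokens_per_second, words_per_token) / READING_SPEED_WPS


if __name__ == "__main__":
    # 22 t/s and 40 t/s are the rates mentioned in the comment;
    # the words-per-token ratios are assumed, not measured.
    for tps in (22, 40):
        for wpt in (0.4, 0.5, 0.75):
            wps = words_per_second(tps, wpt)
            ratio = headroom(tps, wpt)
            print(f"{tps} t/s at {wpt} words/token: "
                  f"{wps:.1f} words/s, {ratio:.1f}x reading speed")
```

Under any of these assumed ratios, both 22 t/s and 40 t/s come out above the ~4 words per second reading speed, which is the point of the comment.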