by embedding-shape
202 days ago
> I run it all the time, token generation is pretty good.

Because you didn't mention prompt processing speed or tokens/s, I feel like you aren't really giving the whole picture here. What are the prompt processing tok/s and the generation tok/s actually like?