Hacker News
by throwawayffffas 10 days ago
He meant prompt eval time, but have a look at these guys:
https://www.youtube.com/watch?v=ndSA9T5yvmM
Over 2500 tokens per second on a single request, with 8 MI300X.