Hacker News
by
chihuahua
907 days ago
> the quality of the response isn't really what we're looking for here. We're looking for speed i.e. tokens per second.
But if it was generating high-quality responses, would that not make it go slower?
nomel
907 days ago
That would involve using a different model. This is not about the model; it’s about the relative speed improvement from the hardware, with this model as a demo.