Y
Hacker News
new
|
ask
|
show
|
jobs
by
esafak
26 days ago
It matters on consumer hardware since barely any model runs at reasonable speed.
1 comments
gaeld
26 days ago
It also matters for thinking models and for agentic workflows, especially in software engineering, where a lot of tokens need to be output in iterative loops before the user sees any result.
This is our main use case.
link
This is our main use case.