by robrenaud 321 days ago
I think your core misunderstanding is that you are assuming K calls to generate 1 token each are as expensive as 1 call to generate K tokens. It is actually much more expensive to generate serially than to generate even in small batches.
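To put rough numbers on that, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a measurement: a roofline-style cost model where each forward pass has to read every weight once, a hypothetical 7B-parameter model in 16-bit weights, and made-up hardware figures of 1 TB/s memory bandwidth and 100 TFLOP/s compute. It ignores the KV cache, attention cost, and kernel launch overhead.

```python
def forward_pass_seconds(tokens, params=7e9, bytes_per_param=2,
                         mem_bandwidth=1e12, peak_flops=1e14):
    """Crude roofline estimate for one forward pass over `tokens` positions.

    All default values are hypothetical, chosen only for illustration.
    """
    # Every pass streams all the weights from memory once, however many
    # tokens it processes.
    weight_read = params * bytes_per_param / mem_bandwidth
    # Roughly 2 FLOPs per parameter per token for the matrix multiplies.
    compute = 2 * params * tokens / peak_flops
    # The pass is limited by whichever of the two is slower.
    return max(weight_read, compute)


def serial_cost(k):
    """K separate calls, each producing 1 token."""
    return k * forward_pass_seconds(1)


def batched_cost(k):
    """1 call that processes K tokens together."""
    return forward_pass_seconds(k)


if __name__ == "__main__":
    for k in (1, 4, 8, 16):
        print(k, serial_cost(k), batched_cost(k))
```

In this toy model a single-token pass is dominated by reading the weights, so while K stays small the batched call costs about the same as one single-token call and the serial path costs about K times as much. Real systems will differ in the details, but that is the shape of the argument.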