Hacker News
by robrenaud 321 days ago
I think your core misunderstanding is that you are assuming K calls that each generate 1 token cost the same as 1 call that generates K tokens. They don't: generating serially is actually much more expensive than generating in batches, even small ones.
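To make the gap concrete, here is a rough back-of-the-envelope sketch. It assumes a simple roofline-style cost model where every forward pass has to stream the full set of weights from memory, so a pass takes whichever is longer: the memory read or the compute. The function names and all the hardware and model numbers (a roughly 7B-parameter fp16 model, 1 TB/s of memory bandwidth, 100 TFLOP/s peak) are hypothetical values picked for illustration, not measurements, and "1 call for K tokens" is modeled as a single pass over K token positions.

```python
# Toy roofline-style cost model. All defaults are hypothetical, illustrative numbers.

def pass_time_s(tokens,
                weight_bytes=14e9,      # assumed: ~7B params at 2 bytes each
                bandwidth_bps=1e12,     # assumed: 1 TB/s memory bandwidth
                flops_per_token=14e9,   # assumed: ~2 FLOPs per parameter per token
                peak_flops=1e14):       # assumed: 100 TFLOP/s peak compute
    """Estimated time for one forward pass over `tokens` token positions."""
    memory_time = weight_bytes / bandwidth_bps            # weights are read once per pass
    compute_time = tokens * flops_per_token / peak_flops  # compute grows with tokens
    return max(memory_time, compute_time)


def serial_time_s(k):
    """K separate calls, each processing a single token."""
    return k * pass_time_s(1)


def batched_time_s(k):
    """One call that processes K tokens in the same pass."""
    return pass_time_s(k)


if __name__ == "__main__":
    for k in (1, 4, 8, 32):
        print(k, serial_time_s(k), batched_time_s(k))
```

Under these assumptions a single-token pass is dominated by reading the weights, so the serial cost grows linearly with K while the batched cost stays flat until compute catches up with the memory read. Real systems add KV-cache traffic, kernel launch overhead, and so on, so treat this as intuition, not a benchmark.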