by remexre 176 days ago
For each token generated, you only send one token’s worth of activations between layers; the keys and values for the previous tokens are already in the KV cache.
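
To make that concrete, here's a rough toy sketch in plain Python (not how any real inference stack does it). `ToyLayer`, `attend`, and `decode_step` are names I made up for illustration, and the layer assumes identity Q/K/V projections with no MLP, purely to keep it short. The thing to look at is that each layer keeps its own cache, while the only thing handed from one layer to the next is a single vector for the newest token.

```python
import math


def attend(q, keys, values):
    """Softmax attention of one query vector over all cached keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [
        sum(w * v[i] for w, v in zip(weights, values)) / total
        for i in range(len(q))
    ]


class ToyLayer:
    """Hypothetical attention-only layer; Q/K/V projections are the identity."""

    def __init__(self):
        self.k_cache = []  # one entry per token seen so far, stays in this layer
        self.v_cache = []

    def step(self, x):
        # x is the activation for the newest token only.
        self.k_cache.append(x)
        self.v_cache.append(x)
        out = attend(x, self.k_cache, self.v_cache)
        return [a + b for a, b in zip(x, out)]  # residual connection


def decode_step(layers, x):
    """Run one new token through every layer, recording what crosses each boundary."""
    sent = []
    for layer in layers:
        x = layer.step(x)
        sent.append(len(x))  # size of the vector passed to the next layer
    return x, sent


layers = [ToyLayer(), ToyLayer()]
tokens = ([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0])
for tok in tokens:
    out, sent = decode_step(layers, tok)

print(sent, len(layers[0].k_cache))  # prints [4, 4] 3
```

After three tokens the cache in each layer holds three entries, but the per-step traffic between layers is still one 4-element vector. In a real model that vector would have the model's hidden size, and the caches would hold projected keys and values per attention head.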