by reitzensteinm
176 days ago
It's a good callout re: tokens vs. letters, but I think you might have misunderstood my point - you can't do it a token at a time unless the intermediate KV cache is stored after each token is generated. This won't be the case in any non-toy implementation, as it would be unnecessary and slow.