by pests
53 days ago
> And it gets churned by every single request they receive.

Not true. It gets calculated once, is effectively baked into the initial state, and is stored in a standard K/V prefix cache. Processing only happens on new input (apart from attention, which still has to contend with the tokens from the prompt).
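To make that concrete, here is a rough toy sketch of the idea, not how any real inference server implements it. Everything in it (`PrefixKVCache`, `project`, `attend`, the sin/cos "projections") is made up for illustration, and it assumes a single layer where a token's key/value depend only on that token. The point is just that the prompt's K/V entries are computed once and reused, while attention for new tokens still reads over them.

```python
import math

DIM = 4  # toy embedding size (hypothetical)


def project(token_id):
    """Toy stand-in for a model's key/value projections of one token."""
    key = [math.sin(token_id * (i + 1)) for i in range(DIM)]
    value = [math.cos(token_id * (i + 1)) for i in range(DIM)]
    return key, value


def attend(query, kv):
    """Softmax attention of one query over every cached (key, value) pair."""
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(DIM)
              for key, _ in kv]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * value[i] for w, (_, value) in zip(weights, kv)) / total
            for i in range(DIM)]


class PrefixKVCache:
    def __init__(self):
        self._store = {}           # prefix token tuple -> list of (key, value)
        self.tokens_projected = 0  # how many tokens were actually processed

    def _compute(self, tokens):
        self.tokens_projected += len(tokens)
        return [project(t) for t in tokens]

    def kv_for_prefix(self, prefix):
        prefix = tuple(prefix)
        if prefix not in self._store:  # only the first request pays for this
            self._store[prefix] = self._compute(prefix)
        return self._store[prefix]

    def run_request(self, prefix, new_tokens):
        kv = list(self.kv_for_prefix(prefix))  # reused prompt K/V
        kv += self._compute(new_tokens)        # fresh work: new input only
        query = kv[-1][0]                      # toy query: last token's key
        return attend(query, kv)               # still attends over the prompt


cache = PrefixKVCache()
system_prompt = [1, 2, 3, 4, 5]
cache.run_request(system_prompt, [10, 11])  # projects 5 prompt + 2 new tokens
cache.run_request(system_prompt, [12])      # projects only the 1 new token
print(cache.tokens_projected)
```

The second request only pays for its one new token, but `attend` still loops over all of the cached prompt entries, which is the attention cost mentioned above.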