|
|
|
|
|
by cjbprime
842 days ago
|
|
I think the answer's in the original question: the provider has to pay for extra storage to cache the model state at the prompt you're asking to snapshot. But it's not necessarily a net increase in costs for the provider, because in exchange for doing so they (as well as you) are getting to avoid many expensive inference rounds. |
|