Hacker News
by PUSH_AX 1006 days ago
You think they are caching? Even though one of the parameters is temperature? That's a can of worms, and it should be reflected in the pricing if true. Don't get me started if they are charging per token for cached responses.

I just don't see it.

1 comment

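You can keep the KV cache from previous generations around, which significantly lowers the cost of processing prompts. Temperature only affects how the next token is sampled from the logits, so the cached keys and values for a prompt are the same whatever temperature you pick.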
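
To make that concrete, here is a rough toy sketch of prefix reuse. It is not how any particular provider implements it: `ToyPrefixCache`, `prefill`, and `sample` are made-up names, and the per-token "entry" is a placeholder standing in for real attention keys and values. The point is only that a second prompt sharing a prefix with the first needs new work for the differing suffix, and that temperature enters at the sampling step, after the cache has been used.

```python
import math
import random


class ToyPrefixCache:
    """Toy stand-in for a KV cache (hypothetical, for illustration only)."""

    def __init__(self):
        self.tokens = []    # tokens whose entries are currently cached
        self.entries = []   # one placeholder (key, value) pair per cached token
        self.computed = 0   # number of tokens we actually had to process

    def _compute_entry(self, position, token):
        # Placeholder for the expensive per-token key/value computation.
        self.computed += 1
        return (position, len(token))

    def prefill(self, prompt_tokens):
        """Process a prompt, reusing cached entries for the shared prefix.

        Returns the number of tokens served from the cache.
        """
        shared = 0
        for old, new in zip(self.tokens, prompt_tokens):
            if old != new:
                break
            shared += 1
        # Keep the shared prefix, drop the rest, compute only the new suffix.
        self.tokens = self.tokens[:shared]
        self.entries = self.entries[:shared]
        for pos in range(shared, len(prompt_tokens)):
            tok = prompt_tokens[pos]
            self.entries.append(self._compute_entry(pos, tok))
            self.tokens.append(tok)
        return shared


def sample(logits, temperature, rng):
    """Temperature sampling; assumes temperature > 0. Never touches the cache."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


cache = ToyPrefixCache()
first = "You are a helpful assistant . Summarize this".split()
second = "You are a helpful assistant . Translate this".split()

reused_first = cache.prefill(first)    # nothing cached yet
reused_second = cache.prefill(second)  # shared prefix comes from the cache

# Sampling is a separate step with its own randomness.
picked = sample([1.0, 2.0, 0.5], temperature=0.7, rng=random.Random(0))
```

Note this reuses intermediate state for the prompt, not the finished response, so the output can still differ from call to call.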