How does it calculate the price? I thought that once you load the content (a 32k-token request, about $2), it would remember the context so you could ask follow-up questions much more cheaply.
It does not have memory outside the context window. If you want to have a back-and-forth with it about a document, that document must be provided in the context (along with your other relevant chat history) with every request.
This is why it's so easy to burn up lots of tokens very fast.
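To make that concrete, here is a rough back-of-the-envelope sketch in Python. The per-token rate is only inferred from the figure in the question (about $2 for 32k tokens), and the document, question, and answer sizes are made-up numbers for illustration. It also counts input tokens only; real pricing varies by model and usually bills output tokens too, so treat this as an estimate, not a billing formula.

```python
def conversation_cost(doc_tokens, question_tokens, answer_tokens, turns, price_per_1k):
    """Estimate input tokens and cost when the whole context is resent on every request.

    All parameters are illustrative assumptions, not real pricing.
    """
    total_tokens = 0
    history_tokens = 0  # earlier questions and answers that must be resent
    for _ in range(turns):
        # Each request carries the document, the chat history so far, and the new question.
        total_tokens += doc_tokens + history_tokens + question_tokens
        # After the reply, this exchange becomes part of the history for the next request.
        history_tokens += question_tokens + answer_tokens
    return total_tokens, total_tokens / 1000 * price_per_1k


# Assumed rate: roughly $2 per 32k tokens, taken from the question above.
PRICE_PER_1K = 2 / 32

tokens, cost = conversation_cost(
    doc_tokens=30_000,    # hypothetical document size
    question_tokens=100,  # hypothetical question size
    answer_tokens=300,    # hypothetical answer size
    turns=5,
    price_per_1k=PRICE_PER_1K,
)
print(f"{tokens} input tokens, about ${cost:.2f}")
```

With these assumed numbers, five short questions about one document cost nearly five times the first request, because the document is paid for again on every turn.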