by CjHuber
139 days ago
It depends on the API path. Chat completions does what you describe, but isn't it legacy? I've only used codex with the responses v1 API, and there it's the complete opposite: reasoning tokens that were already generated persist even if you cancel a turn before the thought process has finished and then send another message (without rolling back). Also, with responses v1, xhigh mode eats through the context window several times faster than the other modes, which is consistent with this.

The docs are a bit misleading/opaque, but essentially reasoning persists across multiple sequential assistant turns and is discarded upon the next user turn [0].
The diagram on that page makes it pretty clear, as does the section on caching.
[0] https://cookbook.openai.com/examples/responses_api/reasoning...
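
To make the rule concrete, here is a toy sketch in Python. It assumes the behaviour is as described above (reasoning kept across sequential assistant steps, dropped once a new user message arrives). `Item` and `visible_context` are made-up names for illustration; this only models which items stay in context and does not call or reproduce the real API.

```python
from dataclasses import dataclass


@dataclass
class Item:
    kind: str  # "user", "reasoning", "tool_call", "tool_output" or "assistant"
    text: str


def visible_context(history):
    """Toy model: return the items that remain in context under the rule above."""
    # Position of the most recent user message (-1 if there is none yet).
    last_user = max(
        (i for i, item in enumerate(history) if item.kind == "user"),
        default=-1,
    )
    # Reasoning is kept only if it was produced after the latest user message.
    return [
        item
        for i, item in enumerate(history)
        if item.kind != "reasoning" or i > last_user
    ]


history = [
    Item("user", "q1"),
    Item("reasoning", "r1"),
    Item("tool_call", "call1"),
    Item("tool_output", "out1"),
    Item("reasoning", "r2"),
    Item("assistant", "a1"),
    Item("user", "q2"),
    Item("reasoning", "r3"),
]

# Mid-turn (before "q2" arrives) both r1 and r2 are still in context.
print([item.text for item in visible_context(history[:5])])
# After "q2", only reasoning from the current turn (r3) remains.
print([item.text for item in visible_context(history)])
```

If this model is right, it would also fit the observation about xhigh: within a long run of tool calls, every reasoning item since the last user message stays in the window.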